Chapter 1
INTRODUCTION 



This is a study about inference. Traditionally, epistemology has investigated 
inference from an introspective point of view. From this perspective, inference 
is what is going on when I consciously believe that α is true and when this
is the reason for which I start to believe consciously that β is true as well.
An inference like that is justified if my belief that α is true is a good reason
for my belief that β is true, i.e., if a conscious, internal or external, rational
argumentation has caused me to believe β on the basis of believing α. This is
usually called an internalist concept of justification. The arguments that are 
used for such a reasoning should be logically valid ones, in a broad sense of 
‘logical’ which includes inductive logic, probability theory, etc. The ideal rea- 
soner’s laws of thought are thus identical to logical laws, since the ideal agent 
explicitly uses logic as the method of reasoning. More recently, this “Boolean 
dream” has turned into the development of artificial cognitive architectures 
that draw inferences in the same way as logicians, mathematicians, or scien- 
tists in general do, when they present a proof on the blackboard: i.e., by the 
manipulation of symbols. Starting from a traditional “internalist” view of justi-
fication, we arrive at a view of inference on the “high level”; the agents drawing
such inferences may be called ‘high-level agents’. By ‘level’ we mean the level 
of the cognitive complexity of an agent or an inference, but also the level of the 
epistemological standards that are to be applied. 

This is not a study about high-level inference. In contrast to traditional 
epistemology, we are primarily interested in those inferences we are not con- 
sciously aware of. Although such inferences are part of common sense, there 
is nothing specifically human about such inferences. We humans share them 
with the cat nearby, which infers that the bird which she sees picking grains
from the dirt is able to fly. In a similar way, the logician - while pondering
on whether P=NP - automatically infers that the car in front of him will turn 
right at the next crossing, because it is signalling to do so. Inferences like these 
are again justified if some belief is a good reason for another belief, but now 
this just means that the process that is linking the former belief with the latter 
is reliable. There is also nothing rational or scientific about justified common 
sense inferences. Neither the cat nor the car-driving logician is implicitly
or explicitly arguing for their common sense conclusions. The ideal reasoner
is subserved perfectly by her inferential capacities, but she is not necessarily 
a rational being. Logic still plays a role as a philosophical tool that may be 
employed, beside others, for the explication of justification; but it is not neces- 
sarily used as a cognitive means of drawing justified inferences. Perhaps it is not 
even possible to be used for such purposes if we think of practical possibility. 
As the lessons from artificial intelligence have taught us, it is relatively easy 
to program a cleverly reasoning chess-playing computer, but it is difficult to 
make a robot justifiedly infer what is not going to change when the robot puts 
the red block on top of the blue one. It is these primitive inferences that have 
stimulated both the development of new logical methods and a new paradigm 
of cognition by connectionist networks. It is these primitive inferences that we 
want to investigate from a philosophical perspective. Thus, starting from an
“externalist” conception of justification we arrive at a view on inference on the
“low level”, and the agents drawing such inferences may be called ‘low-level 
agents’. 

Human agents may be studied from a high-level and from a low-level 
point of view. They are neither purely high-level agents, nor purely low-level 
agents. Hence, when we concentrate on low-level agents, this may just as well 
be regarded as a study of some low-level aspects of human cognition. 

Since philosophy of mind has been strongly influenced by the high-level 
view on belief and inference, we have first to free ourselves from any high-level 
prejudices that may go along with terms like ‘belief’ and ‘inference’. In part I, 
we therefore state a general explication of the notion of inference that is not 
committed to any high level view on inference, nor, in fact, to any particular 
view on inference. We restrict ourselves to a small but non-trivial subclass of
inferences, which enables us to work out the explication in considerable detail.
This subclass consists of inferences from a belief that α[a] is true to a belief
that β[a] is true, where α[a] and β[a] are singular sentences referring to a sin-
gle object a. We can divide these inferences into monotonic and nonmonotonic
ones. Important subclasses of both kinds of inference are in the first case the 
deductive inferences, in the second case the high probability inferences and 
the normic inferences. Our explication of the notion of inference is exclusively 
in causal, system-theoretic terms, without any normative component. On the 
other hand, we leave open whether the notion of belief that we presuppose has 
such a normative dimension (although we doubt it). We give no explicatory 
definition of ‘belief’, but we state some general properties of beliefs and we dis- 
tinguish between different kinds of belief: premise beliefs and conclusion beliefs 
of the inferences that we consider are classified as singular occurrent beliefs; 
those may be perceptual or central state beliefs; in the case of nonmonotonic 
inference, the premise beliefs are total belief states. As we will see, every such 
inference is based on general dispositional beliefs the contents of which are 
expressed by conditionals of the form α[x] ⇒ β[x]. These general beliefs are
neither among the premises of the inference, nor identical to the conclusion
of the inference, but rather correspond to internalized rules of inference which
link the premises to the conclusions. The semantics of ⇒ determines the type
of the general belief and also the type of the inference based on the general
belief. ⇒ may be strict, or it may be defeasible; in the first case this corre-
sponds to the class of monotonic inferences, in the second case to the class of
nonmonotonic inferences. E.g.: (i) the common sense inference from Bird(a)
to CanFly(a) is nonmonotonic, (ii) its premise belief is a total belief, (iii) it
is based on the general dispositional belief that Bird(x) ⇒ CanFly(x) is true,
(iv) ⇒ is a defeasible implication sign which expresses ‘if . . . then normally
. . .’, (v) Bird(x) ⇒ CanFly(x) is thus a normic (“normality”) law and (vi) the
inference therefore is normic.

In part II we outline first the internalist and then the corresponding 
externalist notions of justified belief and inference, and we compare them. We 
argue that the notion of justification that is appropriate for the inferences of 
a low-level agent must be an externalist one. If we say that the cat justifiedly 
infers that the bird right in front of her is able to fly, we use precisely such a 
notion of justified inference. Thus, we develop an externalist reliabilist theory 
of justified inference for low-level agents, first in informal, then in formal terms. 
We discuss how the typical problems that affect reliabilist theories can (as we 
hope) be avoided. A closer examination of the concept(s) of reliability leads
us to the conclusion that a reliabilist theory of justified low-level inference has 
to be based on a qualitative notion of reliability, not on a quantitative notion. 
E.g., if we defined reliability in terms of precise probabilities, this would leave 
us with a notion of justification s.t. no low-level agent would ever be able to 
draw a large set of justified inferences in realistic time. However, since low- 
level agents are able to do so - as we assume - reliability has to be defined 
qualitatively. For the same reason, a low-level agent which only draws justified 
inferences according to qualitative reliability may nevertheless be irrational. 
The justification of monotonic inferences is suggested to be defined by abso- 
lute qualitative reliability, the justification of nonmonotonic inferences by high 
qualitative reliability; ‘high qualitative reliability’ may either mean high prob- 
abilistic reliability or high normic reliability. Our low-level theory of justified 
inference remains incomplete in so far as the induction problem affects the jus- 
tification of the acquisition of those general beliefs on which the inferences that 
we consider are based. 

Part III is concerned with the logic of the expressions that we have used 
in order to define qualitative reliability: the logic of universal conditionals, the 
logic of high probability conditionals, and the logic of normic conditionals. 
The corresponding semantical and proof-theoretical systems are recalled or 
introduced, and they are related to each other in terms of their logical strength 
as well as in terms of soundness and completeness theorems. Some consequences 
for our theory of justified inference are presented.

In part IV we prove the existence of powerful ideal low-level agent ar-
chitectures; when we say ‘ideal’, this is meant just with respect to low-level
inferences but not with respect to other cognitive processes or capacities (like 
learning, etc.). This result is shown by applying the definitions and results of the 
first three parts. In principle, there are two kinds of cognitive architectures that 
suggest themselves as subserving an ideal agent: a symbolic computation archi- 
tecture, or a dynamical systems architecture. We argue that a powerful ideal 
high-level agent which uses symbolic computation for low-level inferences is the- 
oretically possible, but presumably suffers from serious practical constraints. 
On the other hand, we can prove that there is an ideal powerful low-level agent 
architecture which does not suffer from the same constraints. The architecture 
is based on a simple dynamical system with a network structure. Within this 
structure, the contents of singular occurrent beliefs are represented by patterns 
of unit activation, and the contents of general defeasible dispositional beliefs 
are represented by the network topology of the agent. The contents of general 
strict dispositional beliefs are represented by properties of activation patterns. 
Nonmonotonic inferences are state-transitions from a total premise belief to a 
stable conclusion belief. The network topology is regarded as fixed, and thus
an agent with such an architecture is not able to learn. On the other hand, an 
agent with such an architecture is shown to draw inferences precisely according 
to our theory of justified inference, if only a particular version of high normic 
reliability is used. Some possible objections to this result are discussed and the 
architecture of our ideal agent is compared to other approaches in Nonmono- 
tonic Reasoning and to the usual artificial neural networks. It is a corollary of 
our results that there is also an artificial neural network which is ideal. 

In the appendix we analyze parts of the ontology that we have pre- 
supposed for our theory, we give an outline of Goldman’s reliability account 
of justified belief, and we show how the results of part IV can be extended to 
notions of justification which are based on different versions of high qualitative 
reliability. 

We have tried to keep this text rather self-contained, in particular 
because methods and theories of fields as diverse as the philosophy of mind, 
epistemology, logic, and cognitive science are used. 

Since human agents may be studied on the high level and on the low 
level, a “complete” philosophy of human inference has to deal with (i) the high-
level aspects, (ii) the low-level aspects, and (iii) how they are related. This study
is a contribution to (ii). Part (iii) demands: (iii.i) as far as the explication of the
notions of belief and inference is concerned: a theory that shows how conscious 
mental states and processes are related to unconscious ones; (iii.ii) as far as the 
justification of belief and inference is concerned: a mixed theory of internalist 
and externalist justification; (iii.iii) as far as the cognitive architecture of ideal 
or real agents is concerned: a hybrid theory of representation that explains 
how symbolic representations may “emerge” from distributed ones. Logical
concepts, methods, and results may be used to establish the common framework
for each of these parts. In this way, psychologistic ideas may get a revival, but
in a cautious and purified form; perhaps, logic is to be reoriented away from
mathematics towards the cognizers. (Cf. Gabbay & Woods [56])

Chapter 2
PRELIMINARIES 



Both philosophers of mind and epistemologists sometimes develop their theories 
on a level of generality so high that they lose sight of the ground. In order to 
avoid a similar pitfall, we will develop our thoughts on inferences and their 
justification only for a specific class of cognitive agents and for a specific class 
of epistemic situations; both of these classes are defined by certain (simplifying) 
assumptions which we will state explicitly. Our theoretical account of inference 
is thus only meant to hold if also these assumptions are met. We will put the 
constraints in terms of a little story, or a sandtable exercise, by which we will 
try to illustrate the more abstract issues. We will try to keep things as simple 
as possible, but at the same time we will try to be as precise as necessary. 

Consider a territory (call it the “sandbox territory”) which is large 
enough to function as an appropriate epistemological laboratory and which 
is both realistic and representative. Suppose that in this territory there are 
various objects; in view of the typical examples in Nonmonotonic Reasoning 
let us assume that these objects include birds, and that among the birds there 
are also penguins. Let D_act be the fixed (finite) set of all objects in this territory.
We use ‘d’ (with or without indices) as a variable ranging over the members of
this “actual” domain D_act, or over the members of other domains.

Let us now introduce a (quite simple) language ℒ by which we can
describe the sandbox territory. The vocabulary of ℒ consists of:

• (The vocabulary of ℒ)

1. Finitely many unary predicates, among which there are the
predicates Bird, CanFly, Wings, Penguin, Large, Red, . . .

2. an individual constant a

3. the usual logical connectives ¬, ∧, ∨, → of classical propositional
logic

4. and parentheses.

Let ℒ be the language that we get when we apply the usual inductive
definition of formulas to our vocabulary. ℒ is thus essentially a propositional
language. E.g., ℒ contains the formulas Bird(a), ¬Penguin(a), Bird(a) ∧
Penguin(a), and so on.

We use ‘α’, ‘β’, ‘γ’, ‘φ’ as metavariables ranging over the
formulas of ℒ, but we will use the same metavariables as ranging also over
the formulas that are contained in the further languages that we are going
to define. ‘P’, ‘Q’ are metavariables for predicates. We will refer to ℒ as the
‘factual (object) language’.

An interpretation ℑ of ℒ relative to a non-empty domain D is a function
which maps each predicate P in the vocabulary of ℒ to a subset ℑ(P) of D. We
fix some “actual” interpretation ℑ_act relative to D_act that is defined in such a
way that ℑ_act(Bird) is the set of birds in D_act, ℑ_act(CanFly) is the set of objects
in D_act which are capable of flying, ℑ_act(Wings) is the set of entities in D_act
which have wings, ℑ_act(Penguin) is the set of penguins in D_act, ℑ_act(Large) is
the set of large objects in D_act, ℑ_act(Red) is the set of red objects in D_act, etc.;
call ℑ_act “intended”.

Given that a denotes some distinguished object d ∈ D for a given do-
main D, we can define truth for the formulas φ ∈ ℒ relative to an interpretation
mapping ℑ in the usual way:

Definition 1 (Satisfaction for Factual Formulas)

Let ⊨ be inductively defined by:

1. (ℑ, d) ⊨ P(a) iff d ∈ ℑ(P)

(for all predicates P in the vocabulary of ℒ)

2. for all complex φ ∈ ℒ:

(ℑ, d) ⊨ φ iff the usual recursive clauses for formulas with logical connec-
tives apply to φ.

We will furthermore restrict ourselves only to interpretation mappings
ℑ and domains D, s.t., for every object d ∈ D there is a so-called object-
description o(d) of d, i.e.: for every d there is a formula o(d) ∈ ℒ, s.t. (ℑ, d) ⊨
o(d) and there is no d' ∈ D, s.t. d' ≠ d and (ℑ, d') ⊨ o(d). This assumption
entails that ℒ is sufficiently expressive to distinguish between all the members
of D, since for every object d in D there is a formula of ℒ which is satisfied by
d and only d. In particular, this is supposed to be the case also for the domain
D_act of our epistemological sandbox territory.
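
The satisfaction relation of Definition 1 and the requirement of object-descriptions can be illustrated by a minimal sketch. It assumes a finite domain and covers only negation, conjunction and disjunction; the tuple encoding of formulas, the names `satisfies` and `is_object_description`, and the three toy objects are illustrative assumptions and not part of the account itself.

```python
# Toy interpretation (an assumption for illustration): each unary predicate
# is mapped to the subset of the domain that falls under it.
DOMAIN = {"tweety", "pingu", "brick"}
INTERP = {
    "Bird": {"tweety", "pingu"},
    "Penguin": {"pingu"},
    "CanFly": {"tweety"},
    "Red": {"brick"},
}


def satisfies(interp, d, phi):
    """Return True iff (interp, d) satisfies phi, with the constant a denoting d.

    Formulas are nested tuples:
    ("atom", P), ("not", f), ("and", f, g), ("or", f, g).
    """
    op = phi[0]
    if op == "atom":
        return d in interp.get(phi[1], set())
    if op == "not":
        return not satisfies(interp, d, phi[1])
    if op == "and":
        return satisfies(interp, d, phi[1]) and satisfies(interp, d, phi[2])
    if op == "or":
        return satisfies(interp, d, phi[1]) or satisfies(interp, d, phi[2])
    raise ValueError(f"unknown connective: {op!r}")


def is_object_description(interp, domain, d, phi):
    """phi describes d iff d satisfies phi and no other object of the domain does."""
    return satisfies(interp, d, phi) and not any(
        satisfies(interp, other, phi) for other in domain if other != d
    )


bird_not_penguin = ("and", ("atom", "Bird"), ("not", ("atom", "Penguin")))
print(satisfies(INTERP, "tweety", bird_not_penguin))
print(is_object_description(INTERP, DOMAIN, "pingu", ("atom", "Penguin")))
print(is_object_description(INTERP, DOMAIN, "tweety", ("atom", "Bird")))
```

In this toy territory Penguin(a) singles out exactly one object, whereas Bird(a) is satisfied by two objects and therefore is not an object-description of either.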

The sentences of ℒ are singular sentences which speak about the object
denoted by the singular term a. In several of the subsequent chapters we will
also make use of general sentences in order to express general properties of
the sandbox territory. In order to do so, we add an individual variable x to
our vocabulary. We will restrict ourselves to general sentences of the special
form α[x] ⟶ β[x] or α[x] ⇒ β[x]*, s.t. for some formulas α[a], β[a] ∈ ℒ,
α[x] is the formula we get by substituting x for a in α[a], and analogously
for β[x] and β[a]†. Adams [2] refers to such formulas as ‘x-formulas’ (which are
also used by Schurz [147]). We will speak of ⟶ and ⇒ in such a context as
‘implication signs’, or ‘conditional signs’; we refer to α[x] ⟶ β[x] and α[x] ⇒
β[x] as ‘general sentences’, ‘implications’, ‘conditionals’, and sometimes maybe
even as ‘law-like sentences’ or ‘laws’. Let ℒ_⟶ (ℒ_⇒) be the set of all such general
sentences α[x] ⟶ β[x] (α[x] ⇒ β[x]); note that ℒ_⟶ (ℒ_⇒) is a “pure” language
of conditionals which excludes all applications of propositional connectives to
such conditionals, i.e.: ℒ_⟶ (ℒ_⇒) neither contains negations of conditionals, nor
conjunctions of conditionals, etc.

As is well known in the philosophy of science, the common distinction
between singular sentences, general sentences, and law-like sentences is prob-
lematic (see, e.g., Stegmüller [169], chapter V), and only necessary conditions
for “generality” are usually formulated, like: a general sentence is not logically
equivalent to a singular one, and so forth. For our purpose, it is sufficient to
define singular sentences as the members of ℒ, to define general sentences as
the formulas of the form α[x] ⟶ β[x] or α[x] ⇒ β[x], and to regard the class
of law-like sentences as a subclass of general sentences with certain additional
properties that we will mainly leave unspecified. Note that law-like sentences
are not necessarily meant to be true sentences. If a law-like sentence is true,
we simply call it a ‘law’ (following Stegmüller [169], p. 273, who in turn follows
Goodman in this respect). The connectives ⟶ and ⇒ should be considered as
binding x, i.e., they are actually quantifiers, but since we only use one individual
variable x, this is formally not so important.

By introducing the notion of a model we can make explicit what it
is that makes a general sentence of a special type true; we say that 𝔐 ⊨
α[x] ⟶ β[x] (𝔐 ⊨ α[x] ⇒ β[x]) iff the model 𝔐 satisfies the general sentence
α[x] ⟶ β[x] (α[x] ⇒ β[x]). We will not give any details about such 𝔐 at this
point. 𝔐 contains an interpretation mapping like ℑ above - we say that 𝔐 is
based on ℑ - but additionally it could have some probability measure or some
accessibility relation as its component (compare part III of this study).

Furthermore, we fix an “actual” model 𝔐_act, the components of which
are defined by the properties of our epistemological sandbox territory; e.g.:
𝔐_act contains ℑ_act as its component, and thus we say that 𝔐_act is based on
ℑ_act.

In what follows, we will distinguish between two kinds of general sen- 
tences, which may be characterized by the properties of the implication sign 
they employ. 

*The distinction between ⟶ and ⇒ is explained below. ⟶ and ⇒ are metavariables
ranging over binary connectives.

† In a context where we both consider general sentences and sentences in ℒ, we will denote
the latter sentences by explicit reference to the constant a, and say, e.g., ‘α[a]’ instead of just
‘α’.

Let imp be some implication sign:

Definition 2 (Monotonic and Nonmonotonic Implication Signs)

1. imp is monotonic iff

for all models 𝔐 (of the given class of models), for all α[x] imp β[x] ∈ ℒ_imp:

if 𝔐 ⊨ α[x] imp β[x], then 𝔐 ⊨ α[x] ∧ γ[x] imp β[x]‡

2. imp is nonmonotonic iff imp is not monotonic, i.e.,

there is a model 𝔐 (of the given class of models),

there are α[x] imp β[x], α[x] ∧ γ[x] imp β[x] ∈ ℒ_imp, s.t.:

𝔐 ⊨ α[x] imp β[x], but 𝔐 ⊭ α[x] ∧ γ[x] imp β[x].

In the following, we will use the metalinguistic symbol ‘⟶’ to range
just over monotonic implication signs; ‘⇒’ is used to range just over nonmono-
tonic implication signs. Thus, for every choice of a monotonic ⟶, ℒ_⟶ is the
corresponding set of monotonic conditionals α[x] ⟶ β[x]; accordingly for ℒ_⇒
and nonmonotonic conditionals α[x] ⇒ β[x] (where α[a], β[a] ∈ ℒ). Let ‘→’ be
an individual metaconstant that denotes the material implication (note that
‘→’ ≠ ‘⟶’), or, rather, it abbreviates a combination of universal quantification
and material implication, i.e., α[x] → β[x] is to be read as ‘all αs are βs’. → is
monotonic, of course; the corresponding class of models is the class of models
of restricted first-order predicate logic (compare part III).
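
Definition 2 can be illustrated on a finite toy model. The following is only a sketch under stated assumptions: `universal` implements the ‘all αs are βs’ reading of the material-implication conditional, whereas `high_proportion` is a hypothetical stand-in for a defeasible sign (true when at least a threshold share of the α-objects are β-objects), since the actual semantics of defeasible signs is left to part III. The domain, the extensions of the predicates and the threshold 3/4 are made up for the example.

```python
from fractions import Fraction

# Toy model (illustrative assumption): a finite domain and predicate extensions.
DOMAIN = ["tweety", "polly", "hedwig", "pingu", "brick"]
INTERP = {
    "Bird": {"tweety", "polly", "hedwig", "pingu"},
    "Penguin": {"pingu"},
    "CanFly": {"tweety", "polly", "hedwig"},
    "Red": {"polly", "brick"},
}


def atom(pred):
    """Open formula P[x], represented as a test on objects of the domain."""
    return lambda d: d in INTERP[pred]


def conj(f, g):
    """Conjunction of two open formulas."""
    return lambda d: f(d) and g(d)


def universal(alpha, beta):
    """'All alphas are betas': every object satisfying alpha satisfies beta."""
    return all(beta(d) for d in DOMAIN if alpha(d))


def high_proportion(alpha, beta, threshold=Fraction(3, 4)):
    """Hypothetical defeasible sign: the share of alphas that are betas is high."""
    alphas = [d for d in DOMAIN if alpha(d)]
    if not alphas:
        return True  # vacuously true when nothing satisfies alpha (assumption)
    hits = sum(1 for d in alphas if beta(d))
    return Fraction(hits, len(alphas)) >= threshold


def survives_strengthening(sign, alpha, beta, gammas):
    """If the conditional holds, check that it holds for every alpha-and-gamma antecedent."""
    if not sign(alpha, beta):
        return True
    return all(sign(conj(alpha, g), beta) for g in gammas)


ATOMS = [atom(p) for p in INTERP]
bird, penguin, can_fly = atom("Bird"), atom("Penguin"), atom("CanFly")

print(survives_strengthening(universal, penguin, bird, ATOMS))
print(high_proportion(bird, can_fly))
print(high_proportion(conj(bird, penguin), can_fly))
print(survives_strengthening(high_proportion, bird, can_fly, ATOMS))
```

A single model can only witness nonmonotonicity, as the last call does for the stand-in sign; monotonicity itself is a claim about all models of the given class and cannot be established by one example.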

So we can define: 

Definition 3 (Strict and Defeasible Conditionals)

1. A sentence is a strict (“hard”) general sentence iff it is of the form
α[x] ⟶ β[x].

2. A sentence is a defeasible (“soft”) general sentence iff it is of the form
α[x] ⇒ β[x].

Strict conditionals remain true if their antecedents are strengthened; 
defeasible conditionals do not necessarily. Note that our usage of the term 
‘strict’ has nothing to do with strict implication in modal logic, i.e., with C.I. 
Lewis’ binary necessary-implication operator; but, of course, conditionals em- 
ploying Lewis’ strict implication sign are instances of strict conditionals in our 
sense of the word. Defeasibility has been studied by Chisholm [34] and Pollock
(for an introduction see Pollock [125]) in an epistemological context, and by
Nute (see e.g. [116]) on the basis of his work on conditional logic.

‡We always adopt the convention that the connectives of propositional logic bind more
strongly than the additionally introduced implication signs.

We want to leave open at this point how the interpretation of general
sentences and singular sentences “interdepend”, because this in turn depends
on the semantics of the implication signs that might be denoted by ⟶ or ⇒.
But we do assume at least the following:

• (Assumption on Strict and Defeasible Conditionals)

Let ⟶, ⇒ be binary connectives that we might consider; let α[a], β[a] ∈ ℒ:

if α[a] is an object-description of d, if 𝔐 is based on ℑ, and if 𝔐 ⊨ α[x] ⟶
β[x] or 𝔐 ⊨ α[x] ⇒ β[x], then also (ℑ, d) ⊨ β[a].

Thus we assume that the general sentences that we refer to subserve 
a special combination of universal instantiation and modus ponens, if a[a] de- 
scribes d uniquely. This is a kind of counterpart to Carnap’s ‘requirement of 
total evidence’ or Hempel’s ‘rule of maximum specificity’ in their theories of in-
duction (see our short discussion at the end of section 8.5), but now on the
level of satisfaction for conditionals. 

If 𝔐 ⊨ α[x] ⇒ β[x], we call all objects d, s.t. (ℑ, d) ⊨ α[a] ∧ ¬β[a],
‘exceptions’ to the general sentence α[x] ⇒ β[x]. Strict general sentences may
be shown to be exceptionless:

Remark 4 Strict general sentences α[x] ⟶ β[x] do not have exceptions.

Proof:

Consider any object d ∈ D: if 𝔐 ⊨ α[x] ⟶ β[x] then also
𝔐 ⊨ α[x] ∧ o(d)[x] ⟶ β[x] by monotonicity, where o(d)[a] = o(d) is one object-
description of d as defined above. Now assume that (ℑ, d) ⊨ α[a] ∧ ¬β[a]: in
this case, α[a] ∧ o(d)[a] is certainly also an object-description of d, since o(d)[a]
is an object-description of d, and (ℑ, d) ⊨ α[a] by assumption; thus, by 𝔐 ⊨
α[x] ∧ o(d)[x] ⟶ β[x] and the assumption on strict and defeasible conditionals
above, we have (ℑ, d) ⊨ β[a], which is a contradiction. ■

For the same reason, we have: 

Remark 5 If (ℑ, d) ⊨ α[a] and 𝔐 ⊨ α[x] ⟶ β[x], then also (ℑ, d) ⊨ β[a].

I.e.: in the case of strict general sentences we thus have the above com-
bination of universal instantiation and modus ponens outright, independently
of whether α[a] is an object-description or not.

E.g., ‘penguins are birds’ is a strict law, since also dead penguins are 
birds, pink penguins are birds, etc.; but ‘birds can fly’ is a defeasible law, 
since penguin birds cannot fly, and thus ‘birds can fly’ is defeated by penguins; 
penguins are exceptions to the “soft” law ‘birds can fly’. 
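
The notion of an exception can be made concrete in the same spirit. The following sketch assumes a finite toy domain with made-up extensions and handles only atomic antecedents and consequents; the helper name `exceptions` is illustrative. It collects the objects that satisfy the antecedent but not the consequent.

```python
# Toy extensions (illustrative assumption, not taken from the text).
DOMAIN = ["tweety", "polly", "pingu", "brick"]
INTERP = {
    "Bird": {"tweety", "polly", "pingu"},
    "Penguin": {"pingu"},
    "CanFly": {"tweety", "polly"},
}


def exceptions(antecedent, consequent):
    """Objects d for which antecedent(a) is true and consequent(a) is false, a denoting d."""
    return [
        d for d in DOMAIN
        if d in INTERP[antecedent] and d not in INTERP[consequent]
    ]


print(exceptions("Penguin", "Bird"))   # strict law: no exceptions expected
print(exceptions("Bird", "CanFly"))    # defeasible law: the penguin shows up
```

In this toy territory ‘penguins are birds’ has no exceptions, while the only exception to ‘birds can fly’ is the penguin.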

Note that we will also use ‘α’, ‘β’, ‘γ’ as metavariables ranging
over the formulas of ℒ_⟶ and ℒ_⇒ (for some implication signs that we
consider).

The classification of general sentences into strict and defeasible ones
will be of prime importance for virtually each of the subsequent chapters.

General sentences involving the material implication → (and an im-
plicitly stated universal quantification) are of course instances of strict general
sentences: we call such general sentences ‘universal’. E.g., ‘all penguins are
birds’ is a universal law, which will be put formally as Penguin[x] → Bird[x].
𝔐 ⊨ α[x] → β[x] is of course defined to be the case if and only if for all d ∈ D:
(ℑ, d) ⊨ α[a] → β[a]. Note that in the case of universal conditionals, there are
also formulas of the form α[x] → (β[x] → γ[x]) that are members of ℒ_→, but
there are no nested implication formulas for an implication sign that is different
from material implication (but, of course, e.g. α[x] ⇒ (β[x] → γ[x]) ∈ ℒ_⇒).

Defeasible laws are sometimes ceteris paribus laws of a special sort (for
a recent overview on and analysis of ceteris paribus laws, including the ones
that we are going to concentrate on, see Schurz [153], [155]). Instances of de-
feasible general sentences are e.g. ‘the probability that an α is a β is high’ or,
equivalently, ‘(by far the) most αs are βs’, but also ‘αs are normally βs’; we
call the laws of the first kind ‘high probability laws’ and the laws of the sec-
ond kind ‘normic laws’ (following Schurz [150], [154], [157], who himself follows
Scriven [158]); if they are not laws but only law-like sentences, we call them
‘high probability law-like sentences’ and ‘normic law-like sentences’. E.g., ‘the
probability that a bird can fly is high’ is a high probability law,
which might be put formally as Bird[x] ⇒_hp CanFly[x], whereas ‘normally,
birds can fly’ is a normic law, which could be stated on the object language
level by Bird[x] ⇒_nor CanFly[x] (‘⇒_hp’ and ‘⇒_nor’ are individual constants
that denote the fixed implication signs for high probability laws and normic
laws, respectively). We can either translate Bird[x] ⇒_nor CanFly[x] into the
metalanguage by ‘normally, birds can fly’, or by ‘normal birds can fly’, or by
‘birds can normally fly’, all of which we take to be synonymous. We hope that
the considerations of the subsequent chapters support the view that these two
kinds of laws play a major role in both natural and artificial agent cognition.
We will deal with both kinds of defeasible laws over and over again. At this
point, we will leave open whether there are high probability conditionals and
normic conditionals which are also law-like in a more demanding sense accord-
ing to which they would have to be necessary (in some sense of the word), they
would have to support counterfactuals, they should be adequate for scientific
explanations, etc. (indeed, Schurz [154] and [156] defends the law-likeness of
such conditionals; compare the discussion in chapter 7).

Apart from the objects in D_act, let there be some cognitive agent A in
the sandbox territory. A will be the subject of our epistemological sandtable
exercises.§ Although A is to inhabit the sandbox territory, we will not count A
among the members of D_act.

§‘A’ is a metalinguistic individual variable ranging over all cognitive agents which meet
the constraints stated in the following chapters. Thus, the claims that we will make about A
are actually universally quantified claims about cognitive agents of a certain type.

Here are the basic capacities which A is assumed to possess:

• (Assumption on A’s Basic Capacities)

1. A has perceptual beliefs (this is explained in more detail in chapter 3)

2. A has central state beliefs (this is also explained in more detail in
chapter 3)

3. A is able to draw inferences (this supposition is discussed in chapter 4)

4. A is able to act intentionally; in particular, A is able to move through
the territory in an intentional way.

When we say that A has perceptual beliefs, this is meant to entail that
there is a (normal) causal connection between A’s immediate environment and
a subset of A’s beliefs (the set of perceptual beliefs), s.t. the latter beliefs are
about A’s immediate environment. For the sake of simplicity, we presume:

• (Assumption on A’s Immediate Environment)

At each time t, the immediate environment around A consists of exactly
one object O(t) ∈ D_act.

O is thus a sequence of objects in D_act, indexed by points of time (‘O’ is
an individual constant denoting a function, i.e., a certain set-theoretic object).
A is able to perceive at t that O(t) has various properties, and she is assumed
to form a set of perceptual beliefs about O(t). Here we do not presuppose any
sophisticated theory of perception: A’s perceptual capabilities are only taken
for granted, because we intend to give examples having to do with perception,
since such kinds of examples are both easily grasped and important concerning
the possible applications of our theory. But perception (as well as memory, or
knowledge) is among the epistemic capacities that are not within the focus of
our interest. E.g., we are not going to ask ourselves later how A’s perceptual
beliefs are to be justified.

For reasons similar to those in the case of perception, we suppose that A is
able to move through the sandbox territory. We want A’s perceptual beliefs to
change from time to time, and - apart from assuming a dynamical environment
- this is most directly achieved by letting A roam around (thus, it is possible
that O(t) ≠ O(t') for some t ≠ t'). A is assumed to act intentionally, since this
is part of our usual conception of a cognitive agent, and because, from time to
time, our examples are going to involve intentional actions. Since A is assumed
to act intentionally, A is also assumed to have desires.

We assume time to be discrete but open-ended, i.e., $O(t)$ is defined for
all $t = 0, 1, 2, \ldots$ Because time is regarded as discrete, all temporal variables
and specifications will refer to natural numbers or zero; this is also the case
when we speak of “amounts” or “differences” $\Delta t$ of time. In order to be able
to speak about the object which A encounters currently (at t), we will simply
interpret the individual constant a by the object $O(t)$. Thus, relative to the
given sequence O, for each t, $\langle \mathfrak{I}_{act}, O(t) \rangle \models \varphi$ if and only if $\varphi$ is true at time t
in the immediate environment of A, since $O(t)$ is the single object within the
immediate environment of A. E.g., let $O(t)$ be a member of $\mathfrak{I}_{act}(Bird)$, s.t.
$O(t) \notin \mathfrak{I}_{act}(Penguin)$: in this case, $\langle \mathfrak{I}_{act}, O(t) \rangle \models Bird(a)$, but
$\langle \mathfrak{I}_{act}, O(t) \rangle \not\models Penguin(a)$. In this case, relative to the given sequence O and at time t, there
is a bird standing right in front of A, such that this bird is no penguin. By
means of $\mathcal{L}$ we are therefore also able to speak about the object to be perceived
by A at time t.
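
The satisfaction clause above can be illustrated by a minimal sketch. The object names, the extensions assigned to Bird and Penguin, and the helper `satisfies` are assumptions made only for illustration; they are not part of the theory.

```python
# Toy model (all names and extensions are illustrative assumptions).
# I_act assigns an extension to each unary predicate.
I_act = {
    "Bird": {"tweety", "pingu"},
    "Penguin": {"pingu"},
}

# O maps each time point t = 0, 1, 2 to the single object in A's immediate environment.
O = {0: "tweety", 1: "pingu", 2: "stone"}


def satisfies(predicate: str, t: int) -> bool:
    """True iff <I_act, O(t)> |= predicate(a), where 'a' is interpreted by O(t)."""
    return O[t] in I_act.get(predicate, set())
```

In this toy model, `satisfies("Bird", 0)` holds while `satisfies("Penguin", 0)` does not, which mirrors the case of a bird in front of A that is no penguin.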

Since we want to say something about A’s states of belief, we sometimes
introduce a unary propositional operator B, s.t. for every formula $\varphi \in \mathcal{L}$, or
$\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ for some implication signs $\rightarrow$ and $\Rightarrow$ that we might consider,
$B\varphi$ expresses that A believes that [$\varphi$ is true]. B is - invisibly - indexed by
a reference to A, and this is also the case for all further doxastic operators
to be introduced. Note that we use a truth predicate within ‘A believes that
[$\varphi$ is true]’, since the more convenient ‘A believes that $\varphi$’ is not well-formed,
‘$\varphi$’ being a singular (metalanguage) term^. It might be argued that the
proposition that is expressed by, e.g., ‘snow is white’, is actually not identical
to the proposition expressed by ‘ ‘snow is white’ is true’, and thus that ‘A
believes that snow is white’ is not necessarily equivalent to ‘A believes that
‘snow is white’ is true’; in particular, A might be said to have a concept of
truth in the latter case, but not in the former (see Moser[111], pp.14f). On the other
hand, we might also think of ‘ ‘snow is white’ is true’ as a “meaning”-preserving
translation of ‘snow is white’ in the metalanguage, where object language and
metalanguage are rigidly fixed, s.t. ‘snow is white’ and ‘ ‘snow is white’ is true’
indeed express the same proposition. In any case, we prefer syntactical
precision to philosophical precision here, since the philosophical point
is not going to be important for our topic, but the syntactical point helps to
avoid mixing up the object language level and the metalanguage level.

We will usually add square brackets to sentences about A’s having
beliefs (as in ‘A believes that [$\varphi$ is true]’) just to avoid any possible
ambiguities. In such a case, the content of the belief is expressed by the sentence
which is enclosed by the brackets. But this is only done in order to
disambiguate our notation - in particular, ‘[$\varphi$ is true]’ is not a singular term
denoting a proposition, as is sometimes the case in the literature. Beside the
belief ascription on the object language level by means of formulas like $B\varphi$, we
also use belief ascriptions on the metalinguistic level by phrases like ‘A believes
that [$\varphi$ is true]’. Often we will use expressions in the metalanguage that do not
have counterparts in one of the object languages that we consider; in such a case,
the metalinguistic expression is relevant for the motivation and explanation of
our theory (of inference, of justified inference, etc.), but it does not occur in the
theory itself. We introduce the object language idioms that involve B just to
enable a smooth transition from the theory of inference and justified inference
that is developed in parts I and II to the more logic-oriented parts III and
IV.

^Just as the string Bx is not a well-formed formula of the object language, since B may
only be applied to formulas and not to singular terms.

Before we can say how we are going to ascribe beliefs to A by means of
the belief operator, we have to add some assumptions on our agent. We adopt
a system-theoretic view on A, i.e., we think of A as a system of parameters the
values of which may change in time:

• (System Assumption on A) 

A is a system of parameters evolving in discrete time steps t = 0, 1, 2, . . . 

The discreteness of time is assumed just for simplicity. All of what 
follows might have also been developed for systems evolving in continuous time, 
e.g., for systems defined by differential equations. But discrete time makes 
things a lot easier. 

The system assumption on A is actually close to a “specification” of A 
which is satisfied trivially, since virtually every cognitive architecture may be 
regarded as a system. Let us have a look at two representative examples: 

Example 6

1. (In terms of computers)

Our standard computer systems are electronic systems and as such they
are extremely complex systems of physical parameters. But - at least as
long as they function properly - they may also be viewed as instantiations
of finite automata or Turing machines, if we neglect the fact that
Turing machines are equipped with an infinite tape. A Turing machine is
a system of the following parameters: (i) a countably infinite set of parameters
the values of which (i.e., the tape entries) are members of a set
$\{a_0, \ldots, a_n\}$ of tape symbols, (ii) a parameter the value of which (usually
called ‘the internal state’) is a member of a set $\{q_0, \ldots, q_m\}$, (iii) a parameter
denoting the position of the reading head, and (iv) a parameter,
or a set of parameters, the values of which are strings of words (i.e., the
program).

2. (In terms of connectionist networks)

Connectionist networks are systems of the following parameters: (i) a
set of parameters the values of which are the activities of nodes; these
parameters may be binary or real-valued; (ii) a set of parameters the
values of which are the weights of edges; these parameters may also be
binary, but usually they are real-valued.

A parameter-setting is an assignment of values to the parameters of
the system, where a parameter might be considered as some kind of linguistic
entity. We use ‘s’ (with or without indices) as a variable ranging over the
possible parameter-settings for A. If A is, e.g., a computer system, then the
dynamics of A, i.e., the set of possible trajectories of parameter-settings, is
governed by a program; but if A is a human or animal brain, then it might
be the case that we can only describe A’s dynamics by some difference or
differential equation. Anyway, whether A is at t in some cognitive state or not
is assumed to be completely determined by the parameter-setting of A at t. In
the first chapter of the appendix we state this system-theoretic account of A
in more detail, and in chapter 3 we will specify the subsystems that we assume
the system A to consist of, i.e., a perceptual subsystem, a central subsystem,
and an action subsystem.11
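
To make the notions of parameter-setting and trajectory more concrete, here is a minimal sketch of a two-node threshold network evolving in discrete time. The weights, the threshold, and the update rule are assumptions chosen for illustration; for simplicity the weights are held fixed, so that a parameter-setting is just a pair of binary node activities.

```python
from typing import List, Tuple

# A parameter-setting: the activities of the two nodes (binary, by assumption).
Setting = Tuple[int, ...]

# Hypothetical fixed weights: WEIGHTS[i][j] is the weight of the edge from node j to node i.
WEIGHTS = [[0.0, 1.0],
           [1.0, 0.0]]
THRESHOLD = 0.5


def step(s: Setting) -> Setting:
    """Assumed system law: a node is active at t+1 iff its weighted input at t reaches the threshold."""
    return tuple(
        int(sum(w * a for w, a in zip(row, s)) >= THRESHOLD) for row in WEIGHTS
    )


def trajectory(s0: Setting, length: int) -> List[Setting]:
    """The sequence s(0), s(1), ..., s(length - 1) generated by the system law from s0."""
    traj = [s0]
    for _ in range(length - 1):
        traj.append(step(traj[-1]))
    return traj
```

A sequence of settings counts as a possible trajectory of this toy system only if each setting results from its predecessor by `step`; an arbitrary sequence of pairs would not qualify.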

Now we can introduce our notation for belief ascriptions: 

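Let $\mathcal{L}_B$ be the set of formulas of the form $B\varphi$ where $\varphi \in \mathcal{L}$ or
$\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ for some implication signs $\rightarrow$ and $\Rightarrow$ that we might consider (thus
we use $\mathcal{L}$, $\mathcal{L}_{\rightarrow}$, and $\mathcal{L}_{\Rightarrow}$ also as languages in which we can express the contents of
beliefs); let s be one of the possible parameter-settings of A. We extend the
satisfaction relation $\models$ in the following way: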

• (Notation for Ascribing Beliefs to A) 

$s \models B\varphi$ if and only if A believes in s that [$\varphi$ is true].

If, e.g., $s \models B(Bird(a))$, we will understand this in the way that A
believes in s that the object that is within her immediate environment is a
bird.**
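
The following sketch illustrates the notation under strong simplifying assumptions: a parameter-setting is modelled as a mapping from parameter names to values, and a hypothetical read-out function (`believed_formulas`, with the invented parameter `bird_node`) stipulates which formulas A believes in a given setting. Nothing here is meant as an analysis of what believing consists in.

```python
from typing import Dict, FrozenSet

# A parameter-setting, modelled (by assumption) as parameter name -> value.
Setting = Dict[str, float]


def believed_formulas(s: Setting) -> FrozenSet[str]:
    """Hypothetical read-out: A believes Bird(a) in s iff the 'bird_node' activity exceeds 0.5."""
    beliefs = set()
    if s.get("bird_node", 0.0) > 0.5:
        beliefs.add("Bird(a)")
    return frozenset(beliefs)


def sat_B(s: Setting, phi: str) -> bool:
    """s |= B(phi) iff A believes in s that [phi is true]."""
    return phi in believed_formulas(s)
```

Note that `sat_B` takes only the parameter-setting as input, in line with the assumption that A's cognitive states are completely determined by her parameter-setting.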



11 When we say that our cognitive agent A is a system of parameters, we do so according
to the same manner of talking according to which a physicist might say that the sun
and planets taken together are a dynamical system. More appropriately, we should say: real
world objects (the sun, the planets, our cognitive agent A) instantiate concrete systems (the
solar system, the cognitive system of A) that consist of variables which are features of the
real world and which change in real time in accordance with natural laws; concrete systems
realize abstract systems that are sets of abstract variables governed by mathematical rules
or laws, where the “time” variable ‘t’ ranges over natural or real numbers. In the following
we will simplify matters by identifying the cognitive agent A with an abstract system of
parameters that is realized by a concrete cognitive system instantiated by A (for more on
this see Van Gelder[173], pp.616f).

**For our purposes, the question of whether we understand such phrases in a referentially 
Sometimes we will parametrize s by a temporal parameter t: we will
refer to sequences $(s(t))_{t \in \mathit{In}}$ of parameter-settings, where In is an index set of
successive points of time, and we will regard $s(t)$ as the parameter-setting of
A at time t. Whenever we speak of a “possible system trajectory” or “cognitive
history” $(s(t))_{t \in \mathit{In}}$, $(s(t))_{t \in \mathit{In}}$ is not just any sequence of parameter-settings,
but rather one that corresponds to the dynamics of A, i.e., one that obeys the
system laws of A. If we say that $s(t) \models B\varphi$, then $s(t)$ is the parameter-setting of
A at time t, and A believes at t (relative to $(s(t))_{t \in \mathit{In}}$) that [$\varphi$ is true]. We will
sometimes just say ‘A believes at t that [$\varphi$ is true]’ in the context of an explicitly
or implicitly given sequence $(s(t))_{t \in \mathit{In}}$ and suppress any further reference to
$(s(t))_{t \in \mathit{In}}$ or $s(t)$. Note that a trajectory $(s(t))_{t \in \mathit{In}}$ is not necessarily actually
realized by A; it is just a sequence of parameter-settings for the system A which
could be realized. Thus, ‘system trajectory’ and ‘possible system trajectory’ are
synonymous.

Whether $s(t) \models B\varphi$ for $\varphi \in \mathcal{L}$ or not is generally not dependent on the
value of O at t, and thus there is in general also no way of determining whether
$s(t) \models B\varphi$ or not just by considering the values of $\mathfrak{I}_{act}$ or of O (at t). Note
that by means of the sentences of $\mathcal{L}$ we can only express A’s beliefs concerning
the state of the immediate environment around A at time t. On the basis of
$\mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ we are able to ascribe general beliefs to A.

When we speak of the belief of A that [$\varphi$ is true] (or, synonymously:
the belief of A in the truth of $\varphi$, or A’s belief that [$\varphi$ is true]), we actually use
an operator b on the metalanguage level, s.t. $b\varphi$ is a singular term denoting the
very mental state in which A believes that [$\varphi$ is true]. Wherever it is necessary,
we will introduce such an operator b explicitly on the object language level
(i.e., we will extend $\mathcal{L}_B$ to a language $\mathcal{L}_{B,b}$ by adding all expressions of the
form $b\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ for some implication signs $\rightarrow$ and $\Rightarrow$
that we might consider). E.g., employing the binary predicate Causes, which
is to express the causality relation between belief states of A, we can say:

let $(s(t))_{t \in \mathit{In}}$ be a possible trajectory of A;

• (Notation for Causation of A’s Beliefs by A’s Beliefs)

$t, (s(t))_{t \in \mathit{In}} \models Causes(b\varphi, b\psi)$ if and only if A’s belief that [$\varphi$ is true]
causes at t (relative to $(s(t))_{t \in \mathit{In}}$) A’s belief that [$\psi$ is true].

Causation sentences like $Causes(b\varphi, b\psi)$ are not satisfied by single
parameter-settings, but by a point of time and a trajectory of parameter-settings,
since A’s “cognitive history” before t is relevant for their satisfaction:
let us assume that there is a fixed constant period $\Delta t_c > 0$ of time, s.t. when we
say that A’s belief that [$\varphi$ is true] causes at t (relative to $(s(t))_{t \in \mathit{In}}$) A’s belief
that [$\psi$ is true], we mean that (i) A believes in $s(t - \Delta t_c)$ (i.e., at the time
$t - \Delta t_c$ before t) that [$\varphi$ is true], and (ii) the system A is disposed in $s(t - \Delta t_c)$
to change (after the amount $\Delta t_c$ of time) to the belief that [$\psi$ is true] given the
belief that [$\varphi$ is true].

opaque or in a referentially transparent way will not be important at all. But since a is actually
used as a definite description here and not as a proper name, the referentially opaque reading
would be the more appropriate one.

‘Changing to’ only refers to a state-transition and is much weaker than
‘causing’, since it does not entail any dispositional state; ‘being disposed to
change to’ is defined precisely in def. 187 in chapter 21, but it should be clear
informally what is meant. Note that if a belief causes another belief at t (relative
to $(s(t))_{t \in \mathit{In}}$), then we do not presuppose that the latter belief has not been
held by the agent immediately before t. ‘$\Delta t_c$’ is a metalinguistic individual
constant that denotes the time that it takes A’s belief that [$\varphi$ is true] to cause
A’s belief that [$\psi$ is true] (for variable amounts of time we would rather use
the metalinguistic individual variable ‘$\Delta t$’) - thus we actually only employ the
predicate Causes to express causation in the fixed duration $\Delta t_c$ of time. When
we define inference in chapter 4, we will use $\Delta t_c$ as the fixed amount of time
that it takes until a premise belief causes a conclusion belief in a direct, i.e.,
elementary, inference step. Indirect, i.e., complex, inferences involving more
than one direct inference step will take amounts of time that are multiples
of $\Delta t_c$. This is of course a crass simplification compared to real-world agents,
but it is going to make our theory of inference much more transparent. As we
will see, the explication of the notion of inference is complicated enough even
when such a simplification on the temporal macroperspective is presupposed.
The constant amount $\Delta t_c$ of time is used for all kinds of direct causation
(including, as we will see later, direct causal sustaining) that are relevant for
our theory. Indirect causation can be defined by means of direct causation, but
we will not express indirect causation on the object language level. Note that
$\Delta t_c$ is actually tacitly relativized to the agent (or system) A.

The notion of causation that we employ is weaker than most of the
notions of causation that are currently in use. In particular, our notion lacks
Mackie’s[98] well-known inus condition. This has, e.g., the consequence that the
network agents which we turn to in part IV infer logically true formulas from
arbitrary premises. If this were to be avoided, a more complicated account of
causation and inference would have to be developed, s.t. certain relevance conditions
are incorporated. However, since causation is notoriously problematic
anyway, we prefer the simplifying identification of causes with sufficient conditions
on a dispositional basis.

Finally, we add another assumption concerning the semantics of
$Causes(b\varphi, b\psi)$: when we say that the system A is disposed in $s(t - \Delta t_c)$ to
change (after the amount $\Delta t_c$ of time) to the belief that [$\psi$ is true] given the
belief that [$\varphi$ is true], we also presume that (iii) the perceptual input to A is
constant for the amount $\Delta t_c$ of time, i.e.: A is disposed in $s(t - \Delta t_c)$ to change
(after the amount $\Delta t_c$ of time) to the belief that [$\psi$ is true] given the belief that
[$\varphi$ is true], and given that the perceptual input to A is constant for the amount
$\Delta t_c$ of time (i.e., between $t - \Delta t_c$ and t). So when we say that a belief causes
another one we tacitly mean that this is the case given that the perceptual
input is left invariant.

Summing up we can define causation in our context as follows: 

Definition 7 (Causation of A’s Beliefs by A’s Beliefs)
$t, (s(t))_{t \in \mathit{In}} \models Causes(b\varphi, b\psi)$ iff

1. $s(t - \Delta t_c) \models B\varphi$

2. A is disposed in $s(t - \Delta t_c)$ to change (after the amount $\Delta t_c$ of time) to
the belief that [$\psi$ is true] given the belief that [$\varphi$ is true]

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time).
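
The two clauses of the definition can be sketched as a check over a trajectory. The representation is an assumption made for illustration only: each parameter-setting is reduced to the set of formulas believed in it together with a set of pairs standing in for A's dispositions, and the fixed duration is set to one time step. The precise notion of ‘being disposed to change to’ and the constancy of the perceptual input are not modelled here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

DELTA_T_C = 1  # assumed value of the fixed duration, in discrete time steps


@dataclass(frozen=True)
class Setting:
    beliefs: FrozenSet[str]                    # formulas phi with s |= B(phi)
    dispositions: FrozenSet[Tuple[str, str]]   # (phi, psi): disposed to change to psi given phi


def causes(t: int, traj: Sequence[Setting], phi: str, psi: str) -> bool:
    """Check t, traj |= Causes(b phi, b psi) by the two clauses of the definition."""
    if t - DELTA_T_C < 0:
        return False  # assumption: no earlier setting exists, so the sentence is not satisfied
    s_before = traj[t - DELTA_T_C]
    clause_1 = phi in s_before.beliefs                # s(t - delta) |= B(phi)
    clause_2 = (phi, psi) in s_before.dispositions    # stand-in for the disposition clause
    return clause_1 and clause_2
```

The function inspects the setting at $t - \Delta t_c$ rather than the setting at t, which is why a whole trajectory together with a point of time is needed as input, and not a single parameter-setting.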



Sometimes we neither ascribe some specific belief to A (by means of
$B\varphi$) nor refer to a specific belief of A (by means of $b\varphi$), but we speak of ‘a
belief of A’: in such a case we use a predicate on the metalanguage level, s.t.
the extension of the predicate is the set of A’s beliefs. However, we will not
make use of such a predicate on the object language level.

In the next chapter we will discuss the beliefs of A much more thoroughly,
but we are not going to define what it means to say that A believes
that [$\varphi$ is true] - we will only state some necessary conditions for A’s having
such a belief. In chapter 21 we will deal with some further properties of A’s
belief states.

We add the following important - and rather realistic - assumption on 
A’s beliefs: 

• (Assumption of Belief Incompleteness for A) 

A’s set of beliefs, in particular A’s set of perceptual beliefs, is incomplete. 

By ‘incomplete’ we mean that for all times t, and for “most” formulas
$\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some $\rightarrow$ and $\Rightarrow$): neither $s(t) \models B\varphi$ nor $s(t) \models B\neg\varphi$.

E.g., by perception, A may believe at t that there is a bird in front of her, i.e.,
$s(t) \models B(Bird(a))$, where a is interpreted by $O(t)$. But, at the same time, it
may be the case that neither $s(t) \models B(CanFly(a))$ nor $s(t) \models B(\neg CanFly(a))$,
because the bird $O(t)$ is not moving around at t but just picking grains from
the ground, s.t. A is not able to determine by perception whether the bird can
fly or not.
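
A minimal sketch of what incompleteness amounts to for a single formula, assuming that A's beliefs at t are given as a set of formula strings and that negation is written with a prefixed ‘~’ (both are illustrative conventions, not part of the theory):

```python
from typing import FrozenSet


def neg(phi: str) -> str:
    """Illustrative string encoding of the negation of phi."""
    return "~" + phi


def undecided(beliefs: FrozenSet[str], phi: str) -> bool:
    """True iff neither B(phi) nor B(~phi) holds relative to the given belief set."""
    return phi not in beliefs and neg(phi) not in beliefs
```

With `beliefs = frozenset({"Bird(a)"})`, the formula `"CanFly(a)"` comes out as undecided, whereas `"Bird(a)"` does not.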

Let us state this in terms of a little story which we are going to refer 
to in each of the subsequent parts: 
Example 8 (The Cat&Bird Example)

Let our cognitive agent A be a cat. A has the perceptual belief that
there is a bird right in front of her, and this is all that she believes perceptually.
Suppose that A is hungry and thus wants to catch the bird. If the bird is not
able to fly, A may simply move towards the bird, and since the bird is, say, not
a very fast runner, A will be able to catch the bird. But if the bird is indeed able
to fly, then A will have to choose a totally different kind of strategy. The way
in which A finally decides to hunt the bird should therefore crucially depend
on what A believes about the flying capabilities of $O(t)$. Perception does not
offer any clue in this case: therefore A will have to draw an inference, i.e., to
generate a plausible hypothesis, concerning whether $\langle \mathfrak{I}_{act}, O(t) \rangle \models CanFly(a)$
or not. In this example, or so we are told by our intuitions, A is justified to
infer that the bird can fly, since A believes that there is a bird in front of her,
and - by stipulation - A does not have any other relevant belief (e.g., a belief
undermining this inference, or contradicting the conclusion); thus, lacking any
relevant counterevidence, it is plausible for A to assume that the bird can fly,
and if A draws this inference, the inference will be justified. But if A also had
the perceptual belief that the bird in front of her was actually a penguin, or had
a broken wing, then A would not infer justifiedly that the bird can fly.
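
The pattern of the example can be sketched as follows. This is not the account of inference developed later in the study; it is only a toy rendering of the idea that the conclusion is drawn in the absence of counterevidence and withheld once a defeating belief is present. The formula strings and the set of defeaters are assumptions.

```python
from typing import FrozenSet

# Hypothetical defeaters: beliefs that would undermine the step from Bird(a) to CanFly(a).
DEFEATERS = frozenset({"Penguin(a)", "BrokenWing(a)"})


def infers_can_fly(perceptual_beliefs: FrozenSet[str]) -> bool:
    """Conclude CanFly(a) iff Bird(a) is believed and no defeater is believed."""
    return "Bird(a)" in perceptual_beliefs and not (perceptual_beliefs & DEFEATERS)
```

Adding a premise can thus remove a conclusion: `infers_can_fly(frozenset({"Bird(a)"}))` holds, but it fails once `"Penguin(a)"` is added to the perceptual beliefs.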

In the light of the last example, the question which we are after in the 
first part (part I) of our investigation, is: 

what kinds of processes are such inferences? 

In the second part (part II) we will ask ourselves: 

how can such kinds of processes be justified?

The third part (part III) is devoted to the question: 
what are the logical rules that govern such processes if the latter are justified? 

Finally, in the fourth part (part IV), we will deal with the following 
question: 

what do ideal agents look like that only draw such justified inferences? 

Before we can deal with the answers to these questions, we have to 
clarify in more detail what we mean by saying that A has beliefs: this will be 
done in the next chapter. The explication of the notion of belief will be given on 
a more or less intuitive level, in the way that it is usually done in philosophy of 
mind or in epistemology. On this level, terms like ‘state’, ‘disposition’, ‘causes’, 
‘process’ and so on will be left undefined. Moreover, our explication of the 
notion of belief will only consist of a list of necessary conditions, i.e., we will 
not really define what a belief is. Given this rough characterization of beliefs,
we will be able to define inference in chapter 4 rather thoroughly (but by 
making use of the undefined terms again). In chapter 21 we have tried to put 
the concepts of belief and inference more precisely by defining the complex 
terms which had been left undefined. What will be left undefined, however, 
is in which way the content of a belief corresponds to the role that the belief 
has in the agent’s cognition. The explication of the notions of monotonic and 
nonmonotonic inference, and, more specifically, of deductive, high probability, 
and normic inference, will be part of our explication of the notion of inference 
in chapter 4. Furthermore, we will derive some basic properties of beliefs and 
inferences in the subsequent chapters of part I, we will add some assumptions 
involving our sandbox agent A, and we will try to motivate these assumptions 
in a way that shows that the assumptions do not trivialize the class of cognitive 
agents that we consider. In the next chapter on beliefs, we are going to separate 
general claims about beliefs from specific claims about the beliefs of the agent 
A that we are concentrating on. In chapter 4 on inference, we will formulate all 
explications and definitions just for our sandbox agent A, and not for agents 
in general, because it would be difficult to state a theory of inference without 
several limiting assumptions concerning the class of agents that are considered. 
Chapter 3
BELIEF 



This chapter contains a discussion of some of the properties of beliefs, an outline 
of what kinds of belief there are, and a list of what assumptions we adopt 
concerning the beliefs of our agent A. Each of the concepts introduced in this 
chapter will be put to use in the subsequent chapter on inferences, and in parts 
II, III, and IV. Some of the formal issues are developed in a more detailed 
manner in chapter 21. 

First of all, we start with the usual folk-psychological explication of 
the concept of belief, which we take to comprise (at least) the following claims: 

• (Some Explicatory Properties of Belief) 

1. Beliefs are mental states of a cognitive agent which are intentional, 
i.e., which have some content 

2. the contents of beliefs are propositions, i.e., abstract entities which 
may be expressed by sentences* 

3. the content of a belief partially corresponds to the way in which the 
belief has been acquired by the agent, and to the way in which it 
may be revised by the agent 

4. beliefs have an internal “structure” , which somehow mirrors the in- 
ternal structure of the proposition that is the content of the belief, 
and/or that mirrors the syntactical structure of the sentence that 
expresses the content of the belief 

5. beliefs may be ascribed to an agent by means of a propositional belief 
operator (we have used the operator B for exactly this purpose in 
the last chapter)^ 



*This supposition is made with a grain of salt: as Quine[128] has convincingly argued, (i) 
propositions lack clear identity criteria, and (ii) propositions are the ontological burden of an 
oversimplified conception of language and attitudes. However, since we are not going to be 
confronted with those problems in the following, we decide to put on our “pink philosophical 
glasses” and to accept propositions. 

^We do so just to stick closely to the way of ascribing beliefs that is usual in epistemic logic. 
6. beliefs have an action-guiding, or, more generally, an activity-guiding
function for the agent: “they are mental states apt for selective behaviour
towards the environment” (Armstrong[8], p.339). This is
sometimes also expressed by saying that actions are the typical effects
or manifestations of beliefs, where, if the agent believes that
[$\varphi$ is true], she is guided to act in a manner that is “appropriate”
to the truth of $\varphi$ (given an interpretation for $\varphi$, like in our case the
intended interpretation mapping $\mathfrak{I}_{act}$). E.g., if an adult human being
believes that [$\varphi$ is true], and if she is asked whether $\varphi$ is true,
she will answer the question in the normal case “appropriately”, i.e.,
affirmatively. But this kind of linguistic behaviour is not assumed to
be necessary for the agent’s believing that [$\varphi$ is true]; e.g., the agent
might be incapable of communication.

Property 6 motivates the following assumption: 

• (A Further Assumption of A’s Beliefs) 

Beliefs are states of the whole system A, not of one of her subsystems: e.g.,
by property 6, perceptual beliefs also have an action-guiding function and therefore
cannot be regarded as states of the perceptual subsystem of A, since the 
latter is not able to initiate actions without the action subsystem, or 
without a central subsystem that connects the perceptual subsystem to 
the action subsystem. 

Furthermore, we add the following constraints on how to understand 
the term ‘belief’ to our list of explicative properties: 

• (What We Additionally Mean by ‘Belief of A’) 

1. Belief states are always considered to be state types, not state tokens;
e.g., A may be in one and the same state of belief at different times.
If we say that $s(t) \models B\varphi$, this actually means that if $s(t)$ is the
parameter-setting of the system A at t, then A is at t in the belief
state type that [$\varphi$ is true], or, put differently, A’s cognitive state
token at t is of the belief state type that [$\varphi$ is true]. Moreover, we
employ a state type causation terminology, by which a state type
may be said to cause another state type

But again this is done cum grano salis, because operators do not allow for quantification over
the contents of beliefs on the object language level (at least, as long as we restrict ourselves
to usual modal means). E.g., we cannot say: $s \models \exists\varphi\, B\varphi$, since the latter formula is not
well-formed according to the usual modal-logical languages. This is regrettable, and could
e.g. be avoided by choosing a belief predicate in the style of Carnap and Quine, instead of
an operator. But we will nevertheless conform to the operator approach for more or less
conventional reasons, and express truths like the one above on the meta-linguistic level, by
saying: there is a formula $\varphi$, s.t. $s \models B\varphi$.
2. beliefs are “all-or-nothing” states of acceptance, i.e., we neglect any
approach employing degrees of belief. This is more or less for simplic- 
ity, but there is also a conceptual point to be made here which will 
be outlined in more detail in chapter 7: (i) “qualitative” beliefs are 
less complex cognitive states than beliefs with degrees; thus (ii) the 
cognition of low-level agents perhaps has to be based to some extent 
on all-or-nothing belief states for the sake of complexity reduction; 
(iii) as we have emphasized in the introduction, we are primarily 
interested in low-level agents 

3. belief contents are not necessarily consciously entertained, or thought
of, by A; e.g., many human beings believe that water is wet, and they
do so virtually for their whole life, but this is not always (indeed,
rarely) thought of, or consciously entertained, by its believers

4. we regard the cognitive states of A as sets of parameter-settings of
A; A is in a parameter-setting in a certain state if and only if the
parameter-setting is a member of the state. Since beliefs are also
cognitive states, beliefs are sets of parameter-settings, too, and being
in a belief is equivalent to the current parameter-setting’s being a
member of the belief. We will use corresponding definitions for all
the different kinds of beliefs that we are going to discuss below. Note
that we follow here, more or less, a tradition in the philosophy of
mind which regards states as properties of agents, whereas in system
theory it is more usual to refer to the parameter-settings themselves
as ‘states’.

If A has the property that in every parameter-setting in which A believes
that [$\varphi$ is true], she also believes that [$\psi$ is true], and vice versa, then
item 4 implies: A’s belief that [$\varphi$ is true], i.e., the set of parameter-settings in
which A believes that [$\varphi$ is true], is identical to A’s belief that [$\psi$ is true], i.e.,
to the set of parameter-settings in which A believes that [$\psi$ is true]. Thus, it
is a consequence of our conception of belief states that A’s belief state that [$\varphi$
is true] (equivalently: that [$\psi$ is true]) would in such a case not only have the
proposition that [$\varphi$ is true] as its content, but also the proposition that [$\psi$ is
true]. Therefore, the contents of A’s belief states are not necessarily uniquely
determined. ‘A’s belief that [$\varphi$ is true]’ and ‘A’s belief that [$\psi$ is true]’ are
different singular terms that refer in our example to the same (belief) state of
A.
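
The consequence just described can be made concrete in a small sketch, assuming a system with two binary parameters and two stipulated read-out conditions (`believes_phi`, `believes_psi`) that happen to coincide. All names are illustrative.

```python
from itertools import product
from typing import Callable, FrozenSet, Tuple

Setting = Tuple[int, ...]

# All parameter-settings of a hypothetical system with two binary parameters.
SETTINGS = set(product((0, 1), repeat=2))


def belief_state(believes_in: Callable[[Setting], bool]) -> FrozenSet[Setting]:
    """A belief, regarded as the set of parameter-settings in which it is held."""
    return frozenset(s for s in SETTINGS if believes_in(s))


# Assumed read-outs: A believes phi iff parameter 0 is on, and believes psi under the same condition.
def believes_phi(s: Setting) -> bool:
    return s[0] == 1


def believes_psi(s: Setting) -> bool:
    return s[0] == 1


belief_phi = belief_state(believes_phi)
belief_psi = belief_state(believes_psi)
```

Here `belief_phi == belief_psi` holds: the two descriptions pick out one and the same set of parameter-settings, and being in the belief is simply membership of the current setting in that set.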



Our explicative list of properties of beliefs is not meant to be complete 
with respect to our commonsense notion of belief, but it only contains some 
important necessary conditions. Note that neither animal beliefs have been 
excluded, nor the existence of computer systems having beliefs; put differently: 
the existence of beliefs of the latter kinds is no matter of a priori analysis but 
rather a question of empirical research. 
In the following, we do of course not claim that A is in principle capable
of having the belief that [$\varphi$ is true] for every sentence $\varphi$ in $\mathcal{L}$ or some other
language. E.g., it might be the case that A is incapable of having beliefs the
contents of which are expressed by negated formulas, or by disjunction formulas,
or formulas of another type. What we have claimed above, and what we are
going to claim in the subsequent sections is just that if A has the belief that
[$\varphi$ is true] for some $\varphi$, then this belief has such-and-such properties.

Let us distinguish between the following kinds of beliefs: 

1. first of all, we distinguish perceptual and central state beliefs: the for- 
mer only depend on the agent’s perceptual system, the latter on the 
agent’s central system. This is highlighted in section 3.1. The distinction 
of perceptual beliefs and central state beliefs is heuristically useful in the 
context of a study of inference: inferences lead from premises to conclusions,
s.t. the premises themselves are not revised in the course of the
inference; this is nicely illustrated by the agent’s inferring from “given”
externally determined perceptual beliefs to revisable central state beliefs
that are determined both externally and internally. This model is in fact
unrealistic as far as natural agents are concerned, since there is no perceptual
system in human or animal brains that would only be determined
externally; there are only cognitive subsystems that are determined more
externally than others. But let us accept this discrepancy for the sake 
of our exposition of inferences, although it already commits us tacitly 
to an epistemological point of view which allows at best for a strongly 
restricted form of a coherence theory of justification. This orientation to- 
wards a weak empiricism and foundationalism would be modifiable and 
is thus not a necessary part of our approach but just a convenient choice. 

2. At various places we will refer to the concept of a total belief, which will 
also be essential for our definition of the notion of nonmonotonic inference 
in chapter 4. What is meant by ‘total belief’ is explained in section 3.2. 

3. Moreover, beliefs are sometimes (vaguely, but usefully) qualified as being 
occurrent, or as being dispositional. 

Occurrent beliefs are “causally active” states of the agent; normally, they 
change very quickly. Dispositional beliefs are dispositions of the agent, 
and they normally change slowly. There are two analogies which might 
help to explain the difference between occurrent beliefs and dispositional 
beliefs. The first is in terms of arguments of functions and functions them- 
selves: dispositional beliefs are analogous to functions, whereas occurrent 
beliefs are analogous to the arguments of functions, or the images of ar- 
guments under functions. This intuition is - in a sense - made precise 
in the first chapter of the appendix, and it is also used in section 3.3. 
The second analogy is in terms of formulas and rules of inference: dispositional
beliefs correspond to rules of inference, whereas occurrent beliefs
correspond to formulas which may be used as premises to which the rules 
are applied, or to formulas which may be used as the conclusions that are 
the results of the application of the rules. This picture is taken up in the 
next chapter on inferences. 

We regard the class of dispositions as a sub-species of the class of states 
(along the lines of Armstrong [8], [9]) of an object, s.t. if an object is in such a
state it has the tendency to behave in a certain way under appropriate
circumstances (‘to behave’ is to be understood here as broadly as possible).
The object’s “behaving” under the appropriate circumstances is called 
the manifestation of the disposition. If the appropriate circumstances are 
actually the case, we say that the disposition is activated or triggered, and 
thus, active. The activation of a disposition is different from its causation 
by another state or event: in the first case, the disposition has already 
been there, but not so in the second case. If the circumstances are not ap- 
propriate, we say that the disposition is inactive. The manifestation of a 
disposition is caused by the circumstances; sometimes this is put by saying
that the manifestation of a disposition is caused by the circumstances and
the disposition itself (or the fact of its being present), but this manner of
talking blurs the distinction between occurrent and dispositional states,
and the original idea would turn out to be rather obsolete.

We consider the question of whether a belief is occurrent or dispositional 
as a matter of how the content of the belief is related to the agent, 
i.e., a proposition may be believed occurrently by A, or dispositionally; 
correspondingly, a belief may be an occurrent belief or a dispositional 
belief. Furthermore, we will argue that a belief b may even be, at the same 
time, a substate of an occurrent belief and of a dispositional belief where 
the occurrent belief and the dispositional belief have the same content, 
and where substates are defined in the way that one state is a substate of 
another if and only if, whenever the agent is in the latter state, she is also 
in the former (and we say: if one state is a substate of another, the latter is 
a superstate of the former). In such a case, the belief b is neither occurrent, 
nor dispositional, but it is the superposition of an occurrent belief and a 
dispositional belief. Thus, a single proposition might even be believed both 
occurrently and dispositionally by A, although - qua states - occurrent 
beliefs and dispositional beliefs are usually not identical, even if they 
have the same content. This account of occurrent beliefs and dispositional 
beliefs is discussed in more detail in section 3.3. 

We omit any discussion of memory beliefs, and thus we do not have to 
distinguish between purely dispositional beliefs and memory-dispositional 
beliefs. 

4. Finally, we introduce the classification of beliefs as either being singular,
or being general. 

Whether a belief is singular or general depends on its content, i.e., whether 
its content may only be expressed by a general sentence or not. A belief 
is either singular, or general, but not both of them at the same time. This 
distinction is dealt with in section 3.4. 

A’s having general, dispositional beliefs will be shown in chapter 4 to be a
necessary prerequisite for A’s drawing inferences.



3.1 Perceptual Beliefs and Central State Beliefs 

Perceptual beliefs are the beliefs about the current state of the environment, 
and they are (normally) caused by external stimulation. We understand ‘central 
state belief’ in the way that A may at the same time perceptually believe that 
[$\varphi$ is true], but centrally, i.e., internally and non-perceptually, not believe that
[$\varphi$ is true]: e.g., I may perceptually believe that my girl-friend has just entered
the coffee house, but I still do not centrally believe that she has just entered the 
coffee house, because I believe that she is on a journey in Australia (actually, 
she is not, and she has really just entered the coffee house). This is in line with 
Goldman’s understanding of perceptual beliefs: “... I interpret a percept as a
belief, a perceptual system’s judgment about environmental objects ... Such a
belief can be rejected by the central system. This is just what occurs when vision 
‘says’ that the stick immersed in water is bent, and the central system ‘says’ 
that it is straight. Admittedly, calling perceptual outputs ‘beliefs’ diverges from 
ordinary usage; but it is theoretically fruitful. It makes sense of the intuitively 
compelling feeling that the central system disagrees with the visual system 
when it judges the stick to be straight. Disagreement is best construed as 
conflict in belief.” (Goldman [69], p.197f). Thus we distinguish between A’s
perceptual beliefs and A’s central state beliefs. The former usually have, of 
course, a strong causal influence on the latter. The latter entail expectations 
concerning the former. Percepts as entities different from beliefs are, in the 
following, not considered at all^. The distinction between perceptual beliefs and
central state beliefs will prove to be convenient in all subsequent chapters and 
reflects an assumption that we have already referred to on p. 16, and which we 
will state more precisely in chapter 21, namely, that A consists of subsystems: 
the perceptual system, the central system, and an action system (the latter 
being not so relevant in our context). Any parameter-setting of A includes 
the parameter-setting of A’s perceptual system and the parameter-setting of 
A’s central system. A perceptual state is a state of A’s perceptual subsystem 

^ Goldman [69], p.185, even suggests identifying percepts and perceptual beliefs: “It is
congenial to this view, then, to construe a percept as a belief - a distinctive kind of belief,
which we might call a ‘perceptual belief’.”
and it may be defined as a set of parameter-settings of the perceptual system.
The perceptual system is in a perceptual state if and only if its parameter- 
setting is a member of the state. Correspondingly, a central state is a state 
of A’s central subsystem and it may be defined as a set of parameter-settings 
of the central system. The central system is in a central state if and only if 
its parameter-setting is a member of the state. Whether A has or has not a 
certain perceptual belief at a time t is assumed to be independent of whether 
A has a certain central state belief at the same time, or of what the current 
parameter-setting of the action system looks like; it is assumed to depend only 
on the parameter-setting of the perceptual subsystem. Accordingly, whether A 
has a certain central belief depends only on the parameter-setting of A’s central 
system. But since perceptual beliefs and central state beliefs are beliefs, they
are not states of some of the subsystems of A, but of the system A herself (recall 
the “further assumption on A’s beliefs” stated on p.26). Let us formulate this 
at first for perceptual beliefs by means of the following assumption: 

• (Assumption on A’s Perceptual Beliefs) 

For every perceptual belief of A there is a unique perceptual state, s.t. A 
has the former perceptual belief if and only if her perceptual system is in 
the latter perceptual state. 

Perceptual beliefs are thus the counterpart states of perceptual states 
on the level of the whole system, but not perceptual states themselves. 
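
The relation between perceptual states and perceptual beliefs may be illustrated by a minimal Python sketch. It assumes, purely for illustration, that a parameter-setting of A is a triple consisting of a perceptual, a central and an action setting, each drawn from a tiny finite set; the names `PERCEPTUAL`, `BIRD_STATE` and `perceptual_belief` are hypothetical and do not occur in the text.

```python
from itertools import product

# Hypothetical finite component settings (assumptions made for illustration only).
PERCEPTUAL = {"p0", "p1", "p2"}
CENTRAL = {"c0", "c1"}
ACTION = {"a0"}

# A parameter-setting of the whole agent A combines one setting per subsystem.
ALL_SETTINGS = set(product(PERCEPTUAL, CENTRAL, ACTION))

# A perceptual state is a set of parameter-settings of the perceptual system.
BIRD_STATE = frozenset({"p1", "p2"})


def in_perceptual_state(setting, state):
    """The perceptual system is in `state` iff its own setting is a member of it."""
    perceptual_part = setting[0]
    return perceptual_part in state


def perceptual_belief(state):
    """Counterpart of a perceptual state on the level of the whole system A:
    the set of all full parameter-settings whose perceptual part lies in `state`."""
    return frozenset(s for s in ALL_SETTINGS if in_perceptual_state(s, state))


bird_belief = perceptual_belief(BIRD_STATE)
```

In this sketch `bird_belief` is a set of parameter-settings of A herself, whereas `BIRD_STATE` is a set of settings of the perceptual subsystem only; whether A has the belief depends on nothing but the perceptual component, as the assumption demands.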

The analogous assumption for A’s central state beliefs is superfluous, 
since it is going to follow from further (but different) assumptions that we will 
make on central state beliefs in section 3.3. 

If we explicitly want to ascribe perceptual beliefs to A, we will use the
operator $B^{p}$ that we add to our vocabulary:

• (Notation for Ascribing Perceptual Beliefs to A)

$s \models B^{p}\varphi$ iff A perceptually believes in s that [$\varphi$ is true].

Perceptual beliefs will be denoted analogously to beliefs in general: we
just employ an operator $b^{p}$ instead of $b$. $\mathcal{L}_{B^{p},b^{p}}$ is the corresponding language
used for belief ascription. We have already used - and we will subsequently 
often use - the predicate ‘a perceptual belief of A’ the extension of which is 
the set of perceptual beliefs of A; but we do not introduce a corresponding 
predicate on the object language level. 

In the same way, we apply $B^{c}$ and $b^{c}$ (and the language $\mathcal{L}_{B^{c},b^{c}}$) for the
case of central state beliefs, and we use the predicate ‘a central state belief of
A’ on the metalanguage level. 

So we have: 

• (Notation for Ascribing Central State Beliefs to A) 
$s \models B^{c}\varphi$ iff A centrally believes in s that [$\varphi$ is true].



3.2 Total Beliefs

Sometimes we do not just want to say that A believes that [$\varphi$ is true], but even
that all that A believes is that [$\varphi$ is true], or, synonymously, A only believes
that [$\varphi$ is true]. This manner of speaking is, e.g., presupposed when we deal
with nonmonotonic inferences in chapter 4.

The notion of ‘all that A believes is that ...’ may be put formally by
employing an all-that-A-believes operator $AB$ (in the style of Levesque [93]):

Let $\mathcal{L}_{AB}$ be the set of formulas of the form $AB\varphi$ where $\varphi \in \mathcal{L}$ or
$\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ for some $\rightarrow$ and $\Rightarrow$ we might employ; let $s$ be a
parameter-setting of A. We extend the satisfaction relation $\models$ in the following way:

• (Notation for Total Belief Ascription to A)

$s \models AB\varphi$ if and only if all that A believes in s is that [$\varphi$ is true].

If we explicitly want to ascribe a total perceptual belief to A, then we
use the operator $AB^{p}$ (and the language $\mathcal{L}_{AB^{p}}$); for ascribing total central state
beliefs we use $AB^{c}$ (and the language $\mathcal{L}_{AB^{c}}$):

• (Notations for Ascribing Total Perceptual/Central State Beliefs to A)

1. $s \models AB^{p}\varphi$ iff all that A perceptually believes in s is that [$\varphi$ is true]

2. $s \models AB^{c}\varphi$ iff all that A centrally believes in s is that [$\varphi$ is true].

E.g., A might perceptually believe at t that there is a bird in front of 
her, s.t. this is the only perceptual belief she has at t, since for some reason the 
black-colouredness of the bird and its cheeping sounds did not manage to turn 
into perceptual beliefs. In this case we would have that $s(t) \models AB^{p}(Bird(a))$.

The concept of a total belief is the reified version of the all-that-A-believes
terminology, i.e., instead of ascribing a state to A, we refer to a state
of A§: A has the total belief that [$\varphi$ is true] in the parameter-setting s if and
only if $s \models AB\varphi$. Here we actually introduce an operator $tb$ on the metalanguage
level, s.t. $tb\varphi$ is a singular term denoting the very mental state in which all that
A believes is that [$\varphi$ is true]. Accordingly, we introduce the operator $tb$ on the
object language level (extending to a language $\mathcal{L}_{AB,tb}$); we say:

let $(s(t))_{t \in \mathbb{N}}$ be a possible trajectory of A;

• (Notation for Causation of A’s Beliefs by A’s Total Beliefs)

$t, (s(t))_{t \in \mathbb{N}} \models Causes(tb\varphi, b\psi)$ if and only if A’s total belief that [$\varphi$ is
true] causes at t (relative to $(s(t))_{t \in \mathbb{N}}$) A’s belief that [$\psi$ is true].

§Horgan&Tienson[82] use the notion of a total cognitive state, which is analogous to that 
of a total belief, except for being more general by including also non-doxastic states. 
Similar to the sentences that express causation by belief states simpliciter,
causation sentences like $Causes(tb\varphi, b\psi)$ are not satisfied by single
parameter-settings, but by a point of time and a trajectory of parameter-settings,
since A’s “cognitive history” before t is again relevant.

We use the same fixed period $\Delta t_c > 0$ of time as in the case of causation
by beliefs, and we say that A’s total belief that [$\varphi$ is true] causes at t (relative
to $(s(t))_{t \in \mathbb{N}}$) A’s belief that [$\psi$ is true] if and only if: (i) all that A believes in
$s(t - \Delta t_c)$ (i.e., at the time $t - \Delta t_c$ before t) is that [$\varphi$ is true], and (ii) the
system A is disposed in $s(t - \Delta t_c)$ to change (after the amount $\Delta t_c$ of time) to
the belief that [$\psi$ is true] given the total belief that [$\varphi$ is true] and (iii) given
that the perceptual input to A is constant for the amount $\Delta t_c$ of time.

‘Changing to’ again only refers to a state-transition and is weaker than
‘causing’ since it does not entail some dispositional state (‘being disposed to
change to’ is defined in def.187 in chapter 21). Just as in the case of causation
by beliefs, we will use the predicate $Causes$ only to express causation within
one direct (or immediate) inference step - just because it is going to simplify
various issues concerning inference. If a total belief causes another belief at t
(relative to $(s(t))_{t \in \mathbb{N}}$), then it is not presupposed that the latter belief has not
been held by the agent immediately before t.

So we get: 

Definition 9 (Causation of A’s Beliefs by A’s Total Beliefs)
$t, (s(t))_{t \in \mathbb{N}} \models Causes(tb\varphi, b\psi)$ iff

1. $s(t - \Delta t_c) \models AB\varphi$

2. A is disposed in $s(t - \Delta t_c)$ to change (after the amount $\Delta t_c$ of time) to
the belief that [$\psi$ is true] given the total belief that [$\varphi$ is true]

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time).
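
The two clauses of this definition may be sketched in Python as follows. This is only a toy rendering under strong assumptions that are not part of the text: time is taken to be discrete, a trajectory is a list of parameter-settings, and each setting simply records the content of the total belief together with a stock of dispositions; the class `Setting` and the helper functions are hypothetical names. The proviso about constant perceptual input is not modelled separately but regarded as built into each stored disposition.

```python
from typing import NamedTuple, Sequence


class Setting(NamedTuple):
    """Toy parameter-setting (hypothetical encoding, not from the text)."""
    total_content: str        # the phi such that all that A believes is phi
    dispositions: frozenset   # pairs (phi, psi): given total belief phi, change to belief psi


def only_believes(s: Setting, phi: str) -> bool:
    """Clause 1: s |= AB phi."""
    return s.total_content == phi


def disposed_to_change(s: Setting, phi: str, psi: str) -> bool:
    """Clause 2, with the constant-input proviso folded into the stored disposition."""
    return (phi, psi) in s.dispositions


def causes(t: int, trajectory: Sequence[Setting], delta: int, phi: str, psi: str) -> bool:
    """Check both clauses of the definition at the earlier time t - delta."""
    if delta <= 0 or t - delta < 0:
        return False  # no admissible earlier parameter-setting
    earlier = trajectory[t - delta]
    return only_believes(earlier, phi) and disposed_to_change(earlier, phi, psi)


rule = frozenset({("Bird(a)", "CanFly(a)")})
trajectory = [Setting("Bird(a)", rule), Setting("Bird(a) & CanFly(a)", rule)]
```

Note that `causes` only inspects the parameter-setting at the earlier time; in line with the definition, it does not check whether the caused belief was already held before t.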

If we explicitly want to speak of a total perceptual belief, we use the
operator $tb^{p}$ (and the language $\mathcal{L}_{AB^{p},tb^{p}}$); analogously for total central state
beliefs and the operator $tb^{c}$ (and the language $\mathcal{L}_{AB^{c},tb^{c}}$).

We can explicate the ascription of a total belief by using the notion of 
substates of beliefs (recall that one belief is a substate of the other, if the latter 
is a subset of the former, i.e., whenever A has the latter belief she also has the 
former) : 

for all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$);

Definition 10 (Explication of Total Belief Ascription to A)

$s \models AB\varphi$, i.e., A has in s the total belief that [$\varphi$ is true] iff

1. $s \models B\varphi$

2. for every $\psi \in \mathcal{L}$: if $s \models B\psi$, then A’s belief that [$\psi$ is true] is a substate
of A’s belief that [$\varphi$ is true].

• A’s total belief that [$\varphi$ is true] is the set of all parameter-settings s, s.t.
$s \models AB\varphi$.

Thus, if A has in s the total belief that [$\varphi$ is true], then she believes
in s that [$\varphi$ is true], and every further belief that she has in s is actually a
substate of the belief that [$\varphi$ is true], and thus a superset of the belief that [$\varphi$
is true]. E.g., it might be the case that $s \models AB\varphi$ and $s \models B(\varphi \vee \psi)$, since A’s
belief that [$\varphi \vee \psi$ is true] might be a substate of A’s belief that [$\varphi$ is true].
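
The substate explication can be made concrete by a small Python sketch in which every belief is represented as the set of parameter-settings in which A holds it. The settings 1 to 3, the contents and the function names are hypothetical and chosen only for illustration.

```python
# Each belief is modelled as the set of parameter-settings in which A holds it.
# The settings and the contents are hypothetical.
BELIEFS = {
    "p": frozenset({1, 2}),
    "p or q": frozenset({1, 2, 3}),
    "q": frozenset({2, 3}),
}


def believes(s, content):
    """s |= B content."""
    return s in BELIEFS[content]


def is_substate(x, y):
    """x is a substate of y iff whenever A is in y she is also in x (x is a superset of y)."""
    return y <= x


def only_believes(s, content):
    """s |= AB content: the content is believed, and every belief held in s is a substate of it."""
    if not believes(s, content):
        return False
    return all(
        is_substate(BELIEFS[other], BELIEFS[content])
        for other in BELIEFS
        if believes(s, other)
    )


def total_belief(content):
    """The total belief as a state: all settings s with s |= AB content."""
    all_settings = set().union(*BELIEFS.values())
    return frozenset(s for s in all_settings if only_believes(s, content))
```

In setting 1 the agent believes both `"p"` and `"p or q"`, and the latter belief is a substate of the former, so all that she believes in setting 1 is `"p"`; in setting 2 the additional belief `"q"` is not a substate of the belief `"p"`, so the total belief ascription fails there.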

Instead of ‘is a substate of’ in def.10 we might also say ‘is included
in’, or ‘is contained in’, or, maybe, ‘is entailed by’. A total belief of A is any
such total belief of A that [$\varphi$ is true] (‘a total belief of A’ is a metalinguistic
predicate, which will not be formalized in the object language). Just as we
speak of a total belief of A, we can speak of a total perceptual belief of A, and
a total central state belief of A.

Note that A’s total belief that [$\varphi$ is true] is generally different from
A’s belief that [$\varphi$ is true]. In this sense, taken precisely, total belief states
are generally no belief states, but they constitute a different class of mental 
states than the class of belief states. The property of being total is independent 
of whether a belief is occurrent or dispositional, singular or general (see the 
subsequent sections). 

Let us consider two ways of how such a concept of belief inclusion might 
be defined without regarding beliefs as sets, and without using the notion of a 
substate as being defined as a superset. We concentrate solely on the case that 
$\varphi \in \mathcal{L}$.

First of all, we might assume that A only has beliefs the contents
of which may be expressed by finite conjunctions of atomic formulas in $\mathcal{L}$.
Given this assumption, we can define that $s \models AB(\alpha_1 \wedge \ldots \wedge \alpha_n)$ iff
(i) $s \models B(\alpha_1 \wedge \ldots \wedge \alpha_n)$, and (ii) if $s \models B(\beta_1 \wedge \ldots \wedge \beta_m)$ for atomic
$\beta_1, \ldots, \beta_m$, then $\{\beta_1, \ldots, \beta_m\} \subseteq \{\alpha_1, \ldots, \alpha_n\}$; in the case of (ii) we might
say that A’s belief that [$\beta_1 \wedge \ldots \wedge \beta_m$ is true] is included in A’s belief that
[$\alpha_1 \wedge \ldots \wedge \alpha_n$ is true].
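
Under the assumption just stated, the two clauses reduce to a simple check on sets of atoms. The following Python sketch is merely an illustration; it represents each believed conjunction by the set of its atomic conjuncts, and the atoms and the function name `only_believes` are hypothetical.

```python
def only_believes(believed_conjunctions, atoms):
    """s |= AB(a1 & ... & an), given the conjunctions believed in s (as sets of atoms)."""
    target = frozenset(atoms)
    believed = {frozenset(c) for c in believed_conjunctions}
    # (i) the conjunction itself is believed; (ii) every believed conjunction
    # only uses atoms that occur in it.
    return target in believed and all(c <= target for c in believed)


# Hypothetical beliefs of A in some parameter-setting s.
believed_in_s = [{"Bird(a)"}, {"Black(a)"}, {"Bird(a)", "Black(a)"}]
```

Here `only_believes(believed_in_s, {"Bird(a)", "Black(a)"})` holds, whereas `only_believes(believed_in_s, {"Bird(a)"})` fails, because the belief about blackness is not included in the belief that there is a bird.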

Another way of defining total beliefs is by the possible worlds semantics 
for doxastic modalities: A has in s the total belief that [$\varphi$ is true] if and only if
the set of worlds that are epistemically accessible relative to s is precisely the
set of worlds which satisfy $\varphi$ (this is the motivation for Levesque’s [93] study in
autoepistemic logic). Compare this to the clause for the usual belief operator:
A has the belief that [$\varphi$ is true] if and only if the set of epistemically accessible
worlds is a subset of the set of worlds which satisfy $\varphi$.
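
The contrast between the two clauses can be seen in a minimal Python sketch over a propositional toy language with two atoms. The worlds, the formulas (given as predicates on worlds) and all names are assumptions made for the sake of illustration.

```python
from itertools import product

ATOMS = ("p", "q")
# A world assigns a truth value to every atom; all four worlds are listed.
WORLDS = [dict(zip(ATOMS, values)) for values in product([True, False], repeat=len(ATOMS))]


def models(formula):
    """Indices of the worlds that satisfy `formula` (a predicate on worlds)."""
    return frozenset(i for i, w in enumerate(WORLDS) if formula(w))


def believes(accessible, formula):
    """Usual belief: every accessible world satisfies the formula."""
    return accessible <= models(formula)


def only_believes(accessible, formula):
    """Total belief: the accessible worlds are exactly the worlds satisfying the formula."""
    return accessible == models(formula)


p = lambda w: w["p"]
p_or_q = lambda w: w["p"] or w["q"]

# Hypothetical setting s in which exactly the p-worlds are epistemically accessible.
accessible_in_s = models(p)
```

With exactly the p-worlds accessible, the agent believes both `p` and `p_or_q`, but all that she believes is `p`: the weaker formula is satisfied by an additional world that is not accessible.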

In part IV we will state a precise definition of the notion of total belief 
for a specific class of cognitive architectures, which are possible architectures of 
our agent A. The notions of total perceptual belief and total central state belief 




Occurrent Beliefs and Dispositional Beliefs 



35 



may be explicated in the same way as the notion of total belief simpliciter, only 
the corresponding qualifications of beliefs as being perceptual or being central 
state have to be added (and analogously for all other kinds of total beliefs that 
we will refer to in the following sections). In the appendix, on pp. 314-318, a 
formal definition of all such kinds of total beliefs is given. 

The notion of a total belief will turn out to be necessary for the ex- 
plication of what a nonmonotonic inference is, and thus it is conceptually very 
important. But there is a problem affecting this notion as far as its practical 
applicability is concerned. This is because we - the epistemological observers 
- are virtually never in the position to say that all that an agent believes is 
that [$\varphi$ is true], since this would usually presuppose almost God-like epistemic
abilities. One cannot simply “take off the agent’s roof” and check the totality
of her beliefs. On the other hand, (strict) conditionals of the form ‘if all that
A believes is that [$\varphi$ is true], then ...’ will play a significant role in our theory,
just as they have played a major role for the whole area of Nonmonotonic
Reasoning. So how can we handle this problem? 

An improved approach would be to use conditionals like ‘if all that A
relevantly believes in s is that [$\varphi$ is true], then ...’, since by this move we would
only have to ascribe total relevant beliefs to A, which seems easier to deal
with, because the set of beliefs which are relevant in s (concerning A’s actions,
etc.) will generally be much smaller than the set of all of A’s beliefs in s. In
this way, we would be able to compensate for our metatheoretical observational
incapabilities by introducing a sort of ceteris paribus clause on the level of the
laws of our theory. However, we decide to trade applicability for simplicity,
and to stick to strict conditionals employing the all-that-A-believes operator.
In future versions of our theory we might enhance applicability by means of
replacing such strict conditionals by normic laws of the form ‘normally, if all
that A believes is that [$\varphi$ is true], then ...’, or even ‘normally, if A believes that
[$\varphi$ is true], then ...’.



3.3 Occurrent Beliefs and Dispositional Beliefs 

Occurrent beliefs are usually characterized as follows:

• (Explication of Occurrent Belief) 

1. Occurrent beliefs are causally active^; e.g., as Goldman [69], p.200, 
states: “they occupy attention and get actively invoked in cognitive 
tasks” 

^This is one of the usual ways of describing occurrent beliefs, but it is a highly vague
one, even if it is put more precisely by defining: x is causally active (at t) iff there is a y s.t.
x causes y (at t).
2. occurrent beliefs are non-dispositional states, and, as Audi [12], p.75,
says: “they take place in the way events do ...”. Contrary to dispositions,
the notion of an occurrent belief does not involve the notion
of a class of initiating causes of the manifestations of the belief, or,
put differently: occurrent beliefs are “stimulus-independent” (this is
Chomsky’s term; it has been applied by Armstrong [8], p.16, to beliefs).
This means that occurrent beliefs do not need to be activated
by a stimulus, i.e., urged by a stimulus to change from an inactive 
state to an active one; but, of course, occurrent beliefs may have 
been caused by a stimulus. 

Armstrong [9], pp. 17-18, adds that occurrent beliefs may manifest
in indefinitely many ways, whereas dispositions always manifest in a
uniquely specified way. But we regard this point as being of minor, since only
terminological, importance: as has also been emphasized by Armstrong, there
are indeed philosophers (e.g., Ryle) who have, for several good reasons, a more 
tolerant view on dispositions according to which dispositions may manifest in 
a variety of ways (so-called “multi-track” dispositions). For the following con- 
siderations this topic will not be relevant. 

Sometimes occurrent beliefs are also understood in the way that they 
are occurrent in consciousness. We take the more general stance and “derive” 
from the list above: 

• (A Property of Occurrent Beliefs) 

Occurrent beliefs are not necessarily conscious states. 

E.g., my belief that there is a tree in my way might cause me to change 
direction (thus being causally active), although I am consciously engaged in a 
discussion with a friend walking nearby, and I am not consciously aware of 
the tree at all. Horgan&Tienson [81], p.133, say on this topic: “any conscious
mental state is an occurrent state. On the other hand, common sense psychol- 
ogy does leave open the conceptual possibility of occurrent beliefs . . . that are 
unconscious.” This is important in so far as we will not have to assume that 
our agent A is conscious in order to ascribe occurrent beliefs to her. 

Moreover, the following is also “entailed” by the above (given some 
further background assumptions which we are not going to state precisely): 

• (A Further Property of Occurrent Beliefs) 

Normally, “an occurrent belief is an episode in “active” or “short-term” 
memory” (Goldman[68], p.526) 

This is the case because in order to be causally active (i) an occurrent 
belief normally has to last for more than just an instant of time, but, on the 
other hand, (ii) occurrent beliefs cannot be states of the long-term memory (if
the agent has long-term memory capacities at all), because such states are - or
so we take the psychological concept of long-term memory to be defined - not
by themselves causally active, but they are rather activated by some stimulus
that is external to the agent’s long-term memory; but occurrent beliefs are
stimulus-independent. 

Let us give two examples of how occurrent beliefs might be realized 
within a concrete cognitive architecture. We use the two typical architectures 
as examples which we will focus on later in part IV: 

Example 11 

1. (In terms of computers employing symbolic computation) 

The computer’s occurrent beliefs are those computer states, in which some
sentence(s) of the internal language is (are) contained in the knowledge
base, “currently playing a causal role in the computing process, and so
bringing about the computer’s ‘print-out’” (Armstrong [9], p.18). Whenever
we reconsider this example in the following, we will simply identify
(for the sake of simplicity) this “internal language” with
$\mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ for some implication signs $\rightarrow$ and $\Rightarrow$.

2. (In terms of connectionist networks) 

“patterns of actual, occurrent activation in the network ... play the role ... of
occurrent mental states” (Horgan&Tienson [81], p.132).

In the next chapter we will regard inferences as processes which lead 
from occurrent belief states to further occurrent belief states. 

The notion of dispositional belief can be explicated as follows: 

• (Explication of Dispositional Belief) 

1. Dispositional beliefs may be “quiescent”, i.e., causally inactive

2. dispositional beliefs are dispositional states, i.e., contrary to occur- 
rent beliefs they have initiating causes which trigger their manifesta- 
tions; only appropriate circumstances turn them from being inactive 
to being active. 

Dispositional beliefs have a similar way of manifestation as, e.g., brit- 
tleness, which is manifested by the brittle object’s breaking if struck. Since 
dispositions are an agent’s tendencies to behave in a certain way under appro- 
priate circumstances, and since there seems to be no instance of a “tendency” 
being a conscious state, being “in” the mind or being called up “to” the mind,
it follows that: 

• (A Property of Dispositional Beliefs)

“Dispositional beliefs ... are unconscious states” (Horgan&Tienson [81],
p.133)

(but dispositions may of course be tendencies to cause conscious states). 

Just as in the case of occurrent beliefs, dispositional beliefs too may
usually be characterized by their specific relation to the concept of memory.
Since dispositional beliefs are only activated in certain circumstances, they 
have to be memorized by an agent until such circumstances do prevail; but 
this may take some time, and so a kind of memory storage is needed which is 
independent of whether it is currently causally active or not. So we have: 

• (A Further Property of Dispositional Beliefs) 

If an agent has both short- and long-term memory capacities, then nor- 
mally a dispositional belief consists in some form of long-term memory 
storage (see Goldman [68], p.526). 

Analogously to the case of occurrent beliefs, let us state two examples of how
dispositional beliefs might be realized within a cognitive architecture: 

Example 12 

1. (In terms of computers employing symbolic computation) 

The computer’s dispositional beliefs are those computer states, in which
some sentence(s) of the internal language is (are) contained in the background
knowledge base, possibly without being read or being worked on,
but “waiting” to be read by a program.

2. (In terms of connectionist networks) 

“In connectionist models, dispositional intentional state-types ... are a
matter of a network’s (weights being set to produce a) tendency to generate
occurrent representations under appropriate circumstances”
(Horgan&Tienson [81], p.132). Which dispositional beliefs a network has
thus depends on the topology of the net, and on the association of weights
with the network connections.
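
For the symbolic case, the contrast between the two kinds of belief can be sketched in Python. The class below is a deliberately crude toy and not an architecture discussed in the text: the split into `background_kb` and `working_memory`, the rule format and the method names are assumptions made only for illustration.

```python
class ToyAgent:
    """Toy symbolic agent; a hypothetical illustration, not an architecture from the text."""

    def __init__(self, rules):
        # Dispositional beliefs: stored (premise, conclusion) pairs, "waiting" to be read.
        self.background_kb = set(rules)
        # Occurrent beliefs: sentences currently playing a causal role in the computation.
        self.working_memory = set()

    def perceive(self, sentence):
        """An external stimulus causes an occurrent belief."""
        self.working_memory.add(sentence)

    def step(self):
        """Apply every stored rule whose premise is currently occurrent."""
        triggered = {c for (p, c) in self.background_kb if p in self.working_memory}
        self.working_memory |= triggered
        return triggered
```

A stored rule stays quiescent as long as its premise is absent from the working memory; once `perceive("Bird(a)")` has been called on an agent holding the rule `("Bird(a)", "CanFly(a)")`, the next `step()` manifests the disposition by adding the conclusion as a further occurrent belief, while the background knowledge base itself remains unchanged.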

In the next chapter we will see that every inference is based on a general 
belief, which is in turn a dispositional belief. 

In [81], Horgan&Tienson characterize the dispositional belief that [$\varphi$
is true] as the specific tendency to generate token representations of the very
proposition that [$\varphi$ is true]. Furthermore, they speak of so-called morphological
intentional states, s.t. “Morphological possession of intentional content M is 
a matter of the cognitive system’s being disposed, by virtue of its persisting
structure rather than by virtue of any occurrent states that are tokens of M, 
to undergo state-transitions that are systematically appropriate to content M 
- and to do so, at least much of the time, without generating a token of M 
during the process.” (p.132). As can be seen from the above, we understand
the term ‘dispositional’ more generally than Horgan&Tienson do, s.t. both the 
case of dispositional beliefs in the sense of Horgan&Tienson and the case of 
what Horgan&Tienson call ‘morphological beliefs’ are included. If an agent’s 
dispositional belief that [$\varphi$ is true] is a tendency to cause, under appropriate
circumstances, the occurrent belief that [$\varphi$ is true], and these circumstances
actually hold, then this is a case where the agent both dispositionally and
occurrently believes that [$\varphi$ is true]. Therefore:

• (A Property of Occurrent and Dispositional Beliefs) 

Occurrent and dispositional beliefs may have the same content. 

From this it follows that a belief’s being occurrent or dispositional cannot
be determined just by looking at its content; additionally, it has to be checked
whether this content is held occurrently or dispositionally (or both). An agent
may have the dispositional “long-term” belief the content of which is expressed
by $\varphi$, and - from time to time - also the occurrent “short-term” belief with the
same content.

Finally, since we have explicated the notions of occurrent belief and 
dispositional belief quite broadly, and since we do not know of any real-world 
instance of a belief the content of which would be held neither occurrently nor 
dispositionally, we have: 

• (A Property of Occurrent and Dispositional Beliefs) 

Every belief is occurrent, or dispositional, or it is the superposition, i.e., 
the union of an occurrent and a dispositional belief of the same content. 

When we say ‘every belief . . .’, we use the metalinguistic predicate ‘a 
belief’ in order to quantify over all belief states (e.g., of A); when we speak of 
‘the belief of A that [$\varphi$ is true]’ we use a metalinguistic singular term (like $b\varphi$ in
the object language) in order to denote a certain state of belief of A. A has in s
the latter belief state, i.e., the belief that [$\varphi$ is true] iff A has in s the occurrent
belief that [$\varphi$ is true] or A has in s the dispositional belief that [$\varphi$ is true].
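
Read set-theoretically, this biconditional makes the belief state the union of the corresponding occurrent and dispositional belief states. A minimal Python sketch, with hypothetical parameter-settings 1 to 3 and hypothetical names, may illustrate this:

```python
# Hypothetical sets of parameter-settings in which A holds the belief with
# content phi occurrently and dispositionally, respectively.
occurrent_phi = frozenset({1, 2})
dispositional_phi = frozenset({2, 3})

# A has the belief that phi iff she holds it occurrently or dispositionally,
# so the belief state is the union (superposition) of the two.
belief_phi = occurrent_phi | dispositional_phi


def is_substate(x, y):
    """x is a substate of y iff whenever A is in y she is also in x."""
    return y <= x


def mode(s):
    """How the content phi is held in parameter-setting s."""
    held = []
    if s in occurrent_phi:
        held.append("occurrent")
    if s in dispositional_phi:
        held.append("dispositional")
    return held
```

In this sketch `belief_phi` is a substate of both `occurrent_phi` and `dispositional_phi`, and in setting 2 the content is held in both ways at once.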

We say ‘believes occurrently/dispositionally that [$\varphi$ is true]’, or ‘has
the occurrent/dispositional belief that [$\varphi$ is true]’ in order to make the distinction
between occurrently holding a belief, and dispositionally holding a belief,
explicit; on the object language level, we introduce the operators $B^{o}$, $B^{d}$, $AB^{o}$,
$AB^{d}$ for precisely this purpose to our vocabulary, we extend our language
accordingly, and we say:

• (Notations for Ascribing Occurrent/Dispositional Beliefs/Total Beliefs to A)

1. $s \models B^{o}\varphi$ iff A occurrently believes in s that [$\varphi$ is true]

2. $s \models B^{d}\varphi$ iff A dispositionally believes in s that [$\varphi$ is true]

3. $s \models AB^{o}\varphi$ iff all that A occurrently believes in s is that [$\varphi$ is true]

4. $s \models AB^{d}\varphi$ iff all that A dispositionally believes in s is that [$\varphi$ is
true].

It is obvious how to introduce the operators $b^{o}$, $b^{d}$, $tb^{o}$ and $tb^{d}$, and to
extend the language in a similar way. When we say ‘an occurrent belief of A’
and ‘a dispositional belief of A’ we use metalinguistic predicates the extensions 
of which are the set of A’s occurrent beliefs and the set of A’s dispositional 
beliefs, respectively; again we do not use analogous predicates in the object 
language. 

As indicated in section 3.1, we distinguish between A’s perceptual sub- 
system and A’s central subsystem. The former system is responsible for A’s 
having perceptual beliefs, the latter system for A’s having central state beliefs; 
however, perceptual beliefs and central state beliefs are states of A and not of 
her perceptual subsystem or of her central subsystem. 

Perceptual beliefs are beliefs about the current state of the environment;
among other things, they cause further central beliefs. Perceptual beliefs are
not just held dispositionally by an agent in order to get activated by certain
stimuli, but they are rather caused by stimuli. Of course, an agent may be
disposed to believe that [$\varphi$ is true] in the circumstances of certain external
stimulation - but a disposition to believe that [$\varphi$ is true] is different from a
dispositional belief that [$\varphi$ is true] (see Audi[12], p.202). So we have:

• (A Further Property of A’s Occurrent Beliefs) 

Perceptual beliefs are occurrent beliefs. 

This may be put formally as follows (using the operator $B^{p}$ from the last
chapter): for all parameter-settings s of A, if $s \models B^{p}\varphi$ then $s \models B^{o}\varphi$.

Now let us turn to central state beliefs: here the question of whether a
central state belief is occurrent or dispositional is more difficult to settle, since
we make the following assumption:

• (Assumption on A’s Central System)

A’s central system consists of two subsystems that differ in their
long-time behaviour:

1. the system of the central state parameters that
normally change quickly and abruptly; call this the ‘occurrent central
system’

2. the system of the central state parameters that
normally change slowly and gradually; call this the ‘dispositional
central system’.

The distinction between A’s occurrent central system and A’s dispositional
central system is easily explained in terms of specific cognitive architectures:

Example 13 In a computer (Turing machine) the system of central state
parameters is divided as follows:

1. occurrent central system:

the dynamics of the tape entries, of the so-called internal state, and of
the position of the reading head: all of them usually change quickly and
abruptly with every single computation step of the computer

2. dispositional central system: 

the “zero” dynamics concerning the program of the computer, which is 
usually not changed at all during a computation period. 

Example 14 In a connectionist network the system of central parameters is 
divided as such: 

1. occurrent central system: 

the dynamics of the activity values of the nodes, which usually change
quickly and abruptly with every single computation step of the network

2. dispositional central system:

the dynamics of the weights of the edges (and perhaps the topology of the
network), which usually change slowly and gradually under the influence
of long-term learning algorithms; the topology of the network is often
not changed at all.

Occurrent central states are the states of the occurrent central system: 
they may be regarded as sets of parameter-settings of the occurrent central 
system. Dispositional central states are the states of the dispositional central 
system: they may be regarded as sets of parameter-settings of the dispositional 
central system. Every parameter-setting of the central system may be consid- 
ered as a pair of the parameter-setting of the occurrent central system and the 
parameter-setting of the dispositional central system. Central states have been 
defined in section 3.1 as sets of such parameter-settings of the central system. 
The parameter-setting of the central system is part of the parameter-setting of
A’s overall system.

The special role of the dispositional central system - as opposed to 
the perceptual system or the occurrent central system - is expressed by the 
following assumption: 

• (Assumption on A’s Dispositional Central System) 

Every parameter-setting $s^{d,c}$ of the dispositional central system is
associated with

- a function $nc(s^{d,c})$ that maps every pair $(s^{p}, s^{o,c})$, consisting of a
parameter-setting $s^{p}$ of the perceptual system and a parameter-setting
$s^{o,c}$ of the occurrent central system, to a parameter-setting
$(s^{o,c}_{new}, s^{d,c}_{new})$ of the central system, consisting of a parameter-setting
$s^{o,c}_{new}$ of the occurrent central system and a parameter-setting $s^{d,c}_{new}$ of
the dispositional central system. If $s^{d,c}$ is the “current” parameter-setting
of the dispositional central system, if $s^{p}$ is the “current”
parameter-setting of the perceptual system, and if $s^{o,c}$ is the “current”
parameter-setting of the occurrent central system, then:
$s^{o,c}_{new}$ is the “next” parameter-setting of the occurrent central system, and
$s^{d,c}_{new}$ is the “next” parameter-setting of the dispositional central
system. Call nc the ‘next central state function’.

- a function $na(s^{d,c})$ that maps every pair $(s^{p}, s^{o,c})$, consisting of a
parameter-setting $s^{p}$ of the perceptual system and a parameter-setting
$s^{o,c}$ of the occurrent central system, to a parameter-setting
$s^{a}_{new}$ of the action system. If $s^{d,c}$ is the “current” parameter-setting
of the dispositional central system, if $s^{p}$ is the “current” parameter-setting
of the perceptual system, and if $s^{o,c}$ is the “current”
parameter-setting of the occurrent central system, then: $s^{a}_{new}$ is the
“next” parameter-setting of the action system. Call na the ‘next
action state function’.

Normally, $s^{d,c} = s^{d,c}_{new}$, since the parameter-settings of the dispositional
central system normally change only slowly and gradually (if they change
at all).

Normally, $s^{o,c} \neq s^{o,c}_{new}$, since the parameter-settings of the occurrent
central system normally change quickly and abruptly.

It is thus the parameter-setting of A’s central dispositional system that
determines the “cognitive dynamics” of A, since the next central state function
and the next action state function are determined by the parameter-setting $s^{d,c}$ of
A’s central dispositional subsystem. The parameter-settings of A’s perceptual
system and of A’s central occurrent system are the arguments of the functions
$nc(s^{d,c})$ and $na(s^{d,c})$.
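
The assumption may be illustrated by a minimal sketch in Python, in the spirit of the connectionist example above. It is not part of the formal framework: the names Dispositional, nc and na as Python functions, the threshold 0.5, the decay factor 0.25 and the weight update are assumptions made purely for illustration. The sketch only shows that the dispositional parameter-setting fixes the two functions, whereas the perceptual and the occurrent central parameter-settings serve as their arguments.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class Dispositional:
    """Hypothetical s^{d,c}: edge weights plus a (possibly zero) learning rate."""
    weights: Tuple[Vector, ...]   # one row of input weights per central node
    learning_rate: float = 0.0    # 0.0 means the weights never change


def nc(s_dc: Dispositional) -> Callable[[Vector, Vector], Tuple[Vector, Dispositional]]:
    """Next central state function determined by the dispositional setting s_dc."""
    def step(s_p: Vector, s_oc: Vector) -> Tuple[Vector, Dispositional]:
        # Occurrent part: threshold on weighted percept plus decayed old activity.
        new_oc = tuple(
            1.0 if sum(w * x for w, x in zip(row, s_p)) + 0.25 * old > 0.5 else 0.0
            for row, old in zip(s_dc.weights, s_oc)
        )
        # Dispositional part: weights move by a small step (none if the rate is 0).
        new_weights = tuple(
            tuple(w + s_dc.learning_rate * act * x for w, x in zip(row, s_p))
            for row, act in zip(s_dc.weights, new_oc)
        )
        return new_oc, Dispositional(new_weights, s_dc.learning_rate)
    return step


def na(s_dc: Dispositional) -> Callable[[Vector, Vector], int]:
    """Next action state function; the action is the index of the most driven node."""
    def act(s_p: Vector, s_oc: Vector) -> int:
        drive = [sum(w * x for w, x in zip(row, s_p)) + a
                 for row, a in zip(s_dc.weights, s_oc)]
        return max(range(len(drive)), key=drive.__getitem__)
    return act


# Illustrative settings (assumed values, not taken from the text).
s_dc = Dispositional(weights=((1.0, 0.0), (0.0, 1.0)))
s_p, s_oc = (1.0, 0.0), (0.0, 0.0)
new_oc, new_dc = nc(s_dc)(s_p, s_oc)
action = na(s_dc)(s_p, s_oc)
```

With a learning rate of zero the dispositional setting returned by the step is equal to the one that was passed in, which corresponds to the “normal” case in which only the occurrent setting changes.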
Perceptual beliefs have been assumed to be the counterpart states of
perceptual states on the level of the whole system, s.t., whether the system A 
has a certain perceptual belief is only dependent on whether she is in a certain 
perceptual state. We now make the corresponding assumptions concerning oc- 
current central state beliefs and occurrent central states on the one hand, and 
dispositional central state beliefs and dispositional central states on the other 
hand. 

But first some further object language notations: if we want to ascribe
occurrent central state beliefs, dispositional central state beliefs, total occurrent
central state beliefs, and total dispositional central state beliefs, accordingly,
we use the operators $B^{o,c}$, $B^{d,c}$, $AB^{o,c}$, $AB^{d,c}$, which we add to our vocabulary
(extending the language), and we choose the usual form of belief ascription (e.g.,
$s \models B^{o,c}\varphi$ iff A occurrently centrally believes in s that [$\varphi$ is true], etc.).
In order to refer to the beliefs of the relevant kinds, we use the operators
$b^{o,c}$, $b^{d,c}$, $tb^{o,c}$, $tb^{d,c}$, which we add to our vocabulary, too.
An occurrent central state belief
of A is any such occurrent central state belief of A that [$\varphi$ is true], for some
$\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (and some implication signs $\rightarrow$ and $\Rightarrow$; ‘an occurrent central
state belief of A’ is a metalinguistic predicate which will not be formalized in
the object language); analogously for dispositional central state beliefs.

Occurrent central state beliefs and dispositional central state beliefs 
are - by the property of beliefs stated on p.26 - not states of a subsystem of 
A, i.e., neither of the occurrent central system nor of the dispositional central 
system, but they are states of the system A herself; thus, occurrent central state
beliefs are no occurrent central states, which are the states of the occurrent 
central subsystem, and the same holds for dispositional central state beliefs 
and dispositional central states. However, whether A is in a certain occurrent 
central state belief should - by assumption - only depend on whether A is in a 
corresponding occurrent central state, and analogously for dispositional central 
state beliefs and dispositional central states. We will formulate this similarly 
as we have done for perceptual beliefs and central state beliefs in section 3.1: 

• (Assumption on A’s Occurrent Central State Beliefs) 

For every occurrent central state belief of A there is a unique occurrent 
central state, s.t. A has the former occurrent central state belief if and 
only if her occurrent central system is in the latter state. 

Occurrent central state beliefs are thus the counterpart states of occur- 
rent central states on the level of the whole system, but they are not occurrent 
states themselves. 

• (Assumption on A’s Dispositional Central State Beliefs) 

For every dispositional central state belief of A there is a unique disposi- 
tional central state, s.t. A has the former dispositional central state belief 
if and only if her dispositional central system is in the latter state. 
Dispositional central state beliefs are thus the counterpart states of
dispositional central states on the level of the whole system, but they are not 
dispositional central states themselves. 

By our explication of the notions of occurrent and of dispositional belief 
from above, we have: 

• (A Further Property of A’s Occurrent Beliefs) 

Occurrent central state beliefs are occurrent beliefs. 

This may be put formally as follows: for all parameter-settings s of A, if
$s \models B^{o,c}\varphi$ then $s \models B^{o}\varphi$. Accordingly:

• (A Further Property of A’s Dispositional Beliefs)

Dispositional central state beliefs are dispositional beliefs.

More formally: for all parameter-settings s of A, if $s \models B^{d,c}\varphi$ then
$s \models B^{d}\varphi$.

A central state belief may now be further characterized as either an
occurrent central state belief, or a dispositional central state belief, or the
superposition, i.e., the union of an occurrent central state belief and a dispositional
central state belief of the same content. Thus, A has in s the central state belief
that [$\varphi$ is true] iff A believes in s occurrently centrally that [$\varphi$ is true], or A
believes in s dispositionally centrally that [$\varphi$ is true]. The central state belief
(simpliciter) of A that [$\varphi$ is true] is just the union of A’s occurrent central
state belief that [$\varphi$ is true] and A’s dispositional central state belief that [$\varphi$ is
true] (for some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$, for some implication signs $\rightarrow$ and $\Rightarrow$);
it is a substate of the latter two belief states. Central state beliefs, occurrent
central state beliefs, and dispositional central state beliefs may thus have the
same content. E.g., a scientist might have the central state belief that [arsenic
is poisonous], s.t. (a) this belief has a superstate which is an occurrent central
belief state with the same content, since the scientist has, e.g., just found out
that this is true, and since she currently thinks about it; (b) this belief has a
superstate which is a dispositional central belief state with the same content, by
which, whenever the scientist believes that something is arsenic, she infers that
it is poisonous. This theory of beliefs is therefore more fine-grained than any
theory in which there is just a plain dichotomy of occurrent and dispositional
beliefs.

A’s occurrent belief that [$\varphi$ is true] can now be defined as the set of
parameter-settings of A in which A perceptually believes that [$\varphi$ is true] or in
which A occurrently centrally believes that [$\varphi$ is true]. A’s dispositional belief
that [$\varphi$ is true] is simply identical to A’s dispositional central state belief that
[$\varphi$ is true]. An occurrent belief of A is either the perceptual belief of A that
[$\varphi$ is true], or the occurrent central state belief of A that [$\varphi$ is true], or A’s
occurrent belief that [$\varphi$ is true], which is the union (“superposition”) of the
perceptual belief of A that [$\varphi$ is true] and the occurrent central state belief of
A that [$\varphi$ is true] (for some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$, for some implication signs
$\rightarrow$ and $\Rightarrow$). A dispositional belief of A is a dispositional central state belief of
A.

Finally, A’s belief that [$\varphi$ is true] can be defined as the set of
parameter-settings of A in which A perceptually believes that [$\varphi$ is true] or in which A
centrally believes that [$\varphi$ is true]; equivalently: in which A occurrently believes
that [$\varphi$ is true] or in which A dispositionally believes that [$\varphi$ is true]. A belief
of A is either a perceptual belief of A, or a central state belief of A, or an
occurrent belief of A, or a dispositional belief of A, or rather A’s belief that
[$\varphi$ is true], i.e., the union of the perceptual belief of A that [$\varphi$ is true] and the
central state belief of A that [$\varphi$ is true] (equivalently: the union of the occurrent
belief of A that [$\varphi$ is true] and the dispositional belief of A that [$\varphi$ is true]) (for
some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$, for some implication signs $\rightarrow$ and $\Rightarrow$).
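
Since all of these belief states are defined as sets of parameter-settings, the stated identities can be checked in a toy model. The following Python sketch is only an illustration under strong simplifying assumptions: a parameter-setting of A is represented as a triple of truth values (one flag per subsystem, a hypothetical encoding), and a fixed content is presupposed throughout.

```python
from itertools import product

# Hypothetical encoding: a parameter-setting of A is a triple of flags
# (perceptual, occurrent central, dispositional central). A flag is True if the
# corresponding subsystem is in the state that is the counterpart of the belief
# with the fixed content under consideration.
SETTINGS = set(product([False, True], repeat=3))

perceptual_belief = {s for s in SETTINGS if s[0]}
occurrent_central_belief = {s for s in SETTINGS if s[1]}
dispositional_central_belief = {s for s in SETTINGS if s[2]}

# Unions ("superpositions") as described in the text.
central_state_belief = occurrent_central_belief | dispositional_central_belief
occurrent_belief = perceptual_belief | occurrent_central_belief
dispositional_belief = dispositional_central_belief
belief = perceptual_belief | central_state_belief

# The two characterizations of A's belief (simpliciter) coincide.
same_belief = belief == (occurrent_belief | dispositional_belief)
```

In this toy model, the only parameter-setting outside the belief (simpliciter) is the one in which none of the three subsystems is in the relevant state.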

In summary, we get the following picture of the “family” of beliefs
(on the left-hand side we have listed singular terms referring to certain belief
states, in the first line we have listed general terms the extensions of which are
classes of belief states, ‘+’ expresses membership, ‘-’ expresses non-membership;
‘p’ is for ‘perceptual’, ‘o’ is for ‘occurrent’, ‘d’ is for ‘dispositional’, ‘c.st.’ is for
‘central state’, ‘bel.’ is for ‘belief’):
3.4 Singular Beliefs and General Beliefs

Singular beliefs are defined by their contents: 

• (Explication of Singular Belief) 

Singular beliefs are those beliefs the content of which may be expressed 
by a singular sentence of the language that is chosen to subserve the 
ascription of belief content. 
In the case of our cognitive agent A we will use the metalinguistic
predicate ‘a singular belief of A’, the extension of which is the set of singular
beliefs of A.

General beliefs are defined complementarily to singular beliefs:

• (Explication of General Belief) 

General beliefs are those beliefs the content of which may only be ex- 
pressed by a general sentence of the language that is chosen for the as- 
cription of belief contents. 

In the case of our cognitive agent A we will use the metalinguistic
predicate ‘a general belief of A’, the extension of which is the set of general
beliefs of A.

Although Armstrong [9] does not use the terminology of ‘singular beliefs’¹¹,
he actually speaks of singular beliefs when he refers to the “beliefs
concerning things at particular times and places”, or to the “beliefs concerning
particular matters of fact” (Armstrong [9], p.5).

General beliefs are described by Armstrong [9] as the beliefs about “general
principles” (p.89), or about “general propositions” (p.90). He gives the
following examples for general beliefs: the belief that arsenic is poisonous, the
belief that every even number is the sum of two prime numbers (Armstrong [9],
p.5); the belief that whenever Sally gets angry, she throws the crockery around
(Armstrong [9], p.86); the belief that the decapitated are dead (Armstrong [9],
p.89). Armstrong claims (not quite correctly if we look at his examples) to
restrict his discussion of general beliefs to those the contents of which may be
expressed by universally quantified material conditionals of the form
$\forall x(P(x) \rightarrow Q(x))$, but he does so only by adding that “it may be that there are well-formed
law-like propositions which cannot be rendered in this way” (Armstrong [9],
p.86). Accordingly, we understand the concept of a general belief more
generally in the way that its content may be expressed by any general sentence,
including both strict and defeasible ones. E.g., most of us have the general
belief that birds can fly, but ‘birds can fly’ is in this case no universally quantified
material implication but rather a defeasible one. If the content of a general
belief is expressed by a strict conditional, we call it a strict belief; if it is
expressed by a defeasible conditional, we call it a defeasible belief. If the content
of a general belief is expressed by a conditional of the form $\forall x(P(x) \rightarrow Q(x))$,
i.e., by a universal sentence, we call the belief a ‘universal belief’; if the content
of a general belief is expressed by a high probability conditional, we call the
belief ‘a high probability belief’, whereas if the content is expressed by a normic
conditional, we say that the corresponding belief is a ‘normic belief’. Just as
we have constrained ourselves to general sentences which are either strict
conditionals or defeasible conditionals, we also consider only general beliefs the
contents of which may either be expressed by strict or defeasible conditionals.
E.g., general beliefs in the truth of sentences that consist of several nested
universal and existential quantifications are disregarded.

¹¹ Armstrong [9] only speaks of these beliefs as being opposed to general beliefs; sometimes
he uses the term ‘particular beliefs’ for our term ‘singular beliefs’. But we think that our
choice of terminology matches the usual distinction between general and singular sentences
more properly.

By the explication of the notions of singular and of general belief (note 
the term ‘only’ above) stated before, we immediately have: 

• (Property of Singular and General Beliefs) 

There is no belief which is both singular and general. 

Moreover, under certain circumstances, the classes of singular and general
beliefs are even exhaustive:

• (Property of Singular and General Beliefs) 

If the language, which is used for the ascription of belief contents, is such 
that every sentence which is not singular, is general, then every belief is 
either singular or general. 

This is the case because, by our explication, every belief has a content
which may be expressed by a sentence of the language that is chosen for the
ascription of belief contents. Since we have restricted this language to either a
language of singular sentences ($\mathcal{L}$), or a language of general sentences ($\mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$,
for some implication signs $\rightarrow$ and $\Rightarrow$ that we might employ), the classes of
singular beliefs and general beliefs are in our case indeed exhaustive.

The metalinguistic predicates ‘a singular belief of A’ and ‘a general 
belief of A’ may be combined with each of the predicates from the previous 
sections. E.g.: a perceptual singular belief of A is a perceptual belief of A which 
is additionally singular. We will now focus on the possible combinations of ‘a
singular belief of A’/‘a general belief of A’ on the one hand, and ‘an occurrent
belief of A’/‘a dispositional belief of A’ on the other.

Armstrong [9] claims that if a proposition that is expressed by a general
sentence is believed by an agent, then it is necessarily dispositionally believed,
and it is necessarily not believed occurrently. According to Armstrong, such
general beliefs are manifested in a “single, logically central” sort of way, and
their manifestations are identified by Armstrong as the processes of drawing
certain inferences. If this were true, the concept of general belief would be of
utmost importance for a theory of inference, since it would follow: an agent
has the general belief that [$\alpha[x] \rightarrow \beta[x]$ is true] (or [$\alpha[x] \Rightarrow \beta[x]$ is true],
respectively) iff the agent is disposed to draw the conclusion that [$\beta[a]$ is true],
whenever the circumstances are such that she believes that [$\alpha[a]$ is true].

Although we agree with Armstrong in so far as that the notion of a gen- 
eral belief is important for the notion of inference, we think that Armstrong’s 
account of general beliefs has four deficits: 

(i) If the content of a general belief is expressed by a defeasible
conditional, then this belief cannot be the disposition to draw an inference from
a belief state to a further belief state; rather, it has to be a disposition to
draw an inference from a total belief state (say, a total perceptual belief state
or a total occurrent central belief state) to a further belief state. Otherwise,
since we believe that birds can normally fly, we would always draw the conclusion
that there is something which can fly whenever we believe that there is a bird
right in front of us, even if we additionally believed this bird to be a penguin.
We deal with this problem in more detail in the next chapter.

(ii) General propositions can also be entertained in mind, or thought of 
consciously, and hence they can also be the contents of occurrent beliefs. In this 
case, Armstrong would probably say that thinking of a proposition is different 
from believing the very proposition. E.g., beliefs are necessarily action-guiding, 
but mere thoughts are not (for more on this distinction see Armstrong [9], chap- 
ter 5, on “Hume’s Problem”). However, we can still see no reason why general 
propositions might not also be the contents of occurrent beliefs. E.g., if a med- 
ical doctor believes dispositionally that arsenic is poisonous, and she also has 
the occurrent belief that the substance in front of her is arsenic, then she may 
be disposed to create the occurrent belief that the substance is poisonous, 
by means of creating first the occurrent belief that arsenic is poisonous, and 
drawing some inference on the basis of the two latter occurrent beliefs. In this 
case, the medical doctor would have an occurrent belief, the content of which 
is expressed by a general sentence (‘arsenic is poisonous’), and it may not be 
expressed by a singular one. But note that we have argued in section 3.3 that
the activation of the dispositional belief that [$\varphi$ is true] does not necessarily
lead to the occurrent belief that [$\varphi$ is true]. In chapter 4 we will argue
that inference always involves a dispositional general belief, but not always 
an occurrent general belief. Also Horgan&Tienson[81] consider the possibility 
of general and occurrent beliefs (this is the very reason why they introduce 
their notion of a morphological belief). So, at least, Armstrong’s straightfor- 
ward characterization of general beliefs as dispositional beliefs is not beyond 
any doubt. According to the last section, if we say that an agent has a general 
belief, this is actually short for: the agent has a general belief which is held 
occurrently, or the agent has a general belief, which is held dispositionally, and 
the ‘or’ is not meant to be exclusive. 

(iii) Armstrong claims that the general belief that [$\alpha[x] \rightarrow \beta[x]$ is true]
always involves a disposition to conclude that [$\beta[a]$ is true] under the condition
that the agent believes that [$\alpha[a]$ is true]. But what if the agent indeed has the general
belief that [$\alpha[x] \rightarrow \beta[x]$ is true] and also the belief that [$\alpha[a]$ is true], but
her cognitive capacities are completely distracted by something important she
is doing? Or what if she simply does not want to draw that conclusion, for
whatever reason? In those cases the general belief involves a disposition, but
the disposition is only activated if an instance of the antecedent is believed, and
certain additional conditions are satisfied that are independent of the agent’s
believing the premises, including that nothing else explicitly prevents the disposition
from being activated. Thus, there seem to be two provisos concerning Armstrong’s
account: the conditions under which the disposition is activated will
generally not only comprise the agent’s belief in the premises, and furthermore
a certain ceteris paribus clause has to be added. But since these provisos are
difficult to state more precisely, and since a precise formulation would demand
an increase of both terminology and sophistication, we decide to ignore them
for the sake of simplicity.

(iv) We follow Armstrong in the respect that if an agent has the general
belief that [$\alpha[x] \rightarrow \beta[x]$ is true] or the general belief that [$\alpha[x] \Rightarrow \beta[x]$ is true] for
the implication signs $\rightarrow$ and $\Rightarrow$ that we are mainly interested in, i.e., universal,
high probability or normic implication signs, then it seems to be impossible (at
least under normal circumstances) that the agent is not disposed to draw the
conclusion that [$\beta[a]$ is true], under the condition that she believes that
[$\alpha[a]$ is true] (or, in the defeasible case, that this is all that she believes).
Otherwise, we would hardly be justified to say that the agent
has a belief the content of which is the proposition expressed by $\alpha[x] \rightarrow \beta[x]$ or
$\alpha[x] \Rightarrow \beta[x]$. After all, (i) we have assumed in chapter 2 that a special
combination of universal instantiation and modus ponens, where the antecedent is
satisfied by precisely one object, is indeed truth-preserving for all conditionals
that we consider, and (ii) we have seen that this implies the validity of the
combined instantiation/modus ponens rule for all strict implication signs; let
us additionally assume that (iii) we restrict the set of defeasible implication
signs to a set, s.t. the combined rule is “plausible” for every member of this
set. As far as modus ponens is concerned, it is hardly an exaggeration to say
that modus ponens is “the most fundamental of all deductive inference forms”
(Adams[6], p.120). In addition to this appraisal of modus ponens as a rule of
inference in logic, it was found out in psychological tests that “nearly 100% of
subjects make the valid modus ponens inference” (Eysenck&Keane[46], p.451,
where also an overview of such experimental findings may be found), and thus it
is plausible to assume that “modus ponens represents a fundamental, primitive
cognitive operation” (Goldman[69], p.89) at least within human beings; but
if it is indeed “fundamental” and “primitive”, then this is probably not only
the case for human cognitive architectures, but also for those of less complex
agents. Similar claims can be made in favour of the instantiation rule for strict
general sentences, and - though this is less familiar - also for defeasible general
sentences, as long as the ‘all that is believed’ clause that is associated with the
latter is not neglected. All this does not indicate that people or animals
actually reason by some explicit mental rule-like counterpart to this combination
of the logical rules of modus ponens and universal instantiation, but what the
psychological findings indeed indicate is that if people or animals believe
dispositionally in the truth of $\alpha[x] \rightarrow \beta[x]$, and (totally) occurrently in the truth
of $\alpha[a]$, they also start to believe in the truth of $\beta[a]$.**

But note that a general belief is not identical to the disposition to
draw a certain inference from a premise to a conclusion, or, more generally, to
change from a premise belief to a conclusion belief: e.g., if $\rightarrow_1$ and $\rightarrow_2$ are
implication signs which have a different semantics, then the dispositional belief
that [$\alpha[x] \rightarrow_1 \beta[x]$ is true] and the dispositional belief that [$\alpha[x] \rightarrow_2 \beta[x]$ is true]
will generally be different states (e.g. distinguished in the way that these states
may be acquired or revised), but both of these states will be dispositional states,
s.t. the agent is disposed to draw a direct inference from the belief that [$\alpha[a]$
is true] to the belief that [$\beta[a]$ is true]. Moreover, it is possible that an agent is
disposed to draw the conclusion that [$\gamma[a]$ is true], whenever the circumstances
are such that she believes that [$\alpha[a]$ is true], but where the agent does not have
the general belief that [$\alpha[x] \rightarrow \gamma[x]$ is true]: we think of a case where the agent
has the general belief that [$\alpha[x] \rightarrow \beta[x]$ is true], s.t. she is disposed to draw
the conclusion that [$\beta[a]$ is true] whenever the circumstances are such that she
believes that [$\alpha[a]$ is true], and where the agent has furthermore the general
belief that [$\beta[x] \rightarrow \gamma[x]$ is true], s.t. she is disposed to draw the conclusion that
[$\gamma[a]$ is true] whenever the circumstances are such that she believes that [$\beta[a]$ is
true], but where the agent’s general beliefs are not rationally closed with respect
to $\alpha[x] \rightarrow \beta[x]$, $\beta[x] \rightarrow \gamma[x]$, and $\alpha[x] \rightarrow \gamma[x]$. In such a case, the agent would be
disposed to draw the conclusion that [$\gamma[a]$ is true] whenever the circumstances
are such that she believes that [$\alpha[a]$ is true], although the agent does not have
the general belief that [$\alpha[x] \rightarrow \gamma[x]$ is true]. The latter disposition would
not be a disposition to draw a direct inference within one immediate inference
step, but a disposition to draw an indirect, i.e., a composite inference within
two direct inferences.
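
The distinction between a direct and a composite inference can be sketched in Python as follows. This is only an illustration under the assumption that general beliefs are encoded as pairs of predicate names and that each general belief licenses exactly one direct inference step; the function name inference_steps is hypothetical.

```python
from collections import deque

# Hypothetical encoding of the agent's general beliefs: (antecedent, consequent).
# The set is deliberately not closed: ("alpha", "gamma") is missing.
GENERAL_BELIEFS = {("alpha", "beta"), ("beta", "gamma")}


def inference_steps(start, goal, general_beliefs=GENERAL_BELIEFS):
    """Return the minimal number of direct inferences leading from the belief
    that start[a] is true to the belief that goal[a] is true, or None."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        pred, steps = frontier.popleft()
        if pred == goal:
            return steps
        for ante, cons in general_beliefs:
            if ante == pred and cons not in seen:
                seen.add(cons)
                frontier.append((cons, steps + 1))
    return None


direct = inference_steps("alpha", "beta")       # one direct inference
composite = inference_steps("alpha", "gamma")   # two direct inferences
has_composed_belief = ("alpha", "gamma") in GENERAL_BELIEFS
```

The agent in the sketch reaches the conclusion about gamma from the premise about alpha in two steps, although the corresponding general belief is not among her general beliefs.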

Thus, let us assume for any cognitive agent A that we consider only 
one direction of Armstrong’s equivalence between general belief ascriptions on 
the one hand and inferential disposition ascriptions on the other: 



• (Assumption on A’s General and Dispositional Beliefs)

If A has the general belief that [$\alpha[x] \rightarrow \beta[x]$ is true] ([$\alpha[x] \Rightarrow \beta[x]$ is
true]), then A also has the dispositional belief that [$\alpha[x] \rightarrow \beta[x]$ is true]
([$\alpha[x] \Rightarrow \beta[x]$ is true]), s.t. the agent is disposed to draw a direct inference
leading to the belief that [$\beta[a]$ is true], under the circumstances that (all
that) she believes perceptually, or occurrently centrally, is that [$\alpha[a]$ is
true].

Or, more formally: if $s \models B(\alpha[x] \rightarrow \beta[x])$ ($s \models B(\alpha[x] \Rightarrow \beta[x])$) then
$s \models B^{d}(\alpha[x] \rightarrow \beta[x])$ ($s \models B^{d}(\alpha[x] \Rightarrow \beta[x])$), s.t. the agent has the
disposition described above.

** In the case of explicit rule application, one believes both occurrently and dispositionally
that [$\alpha[x] \rightarrow \beta[x]$ is true], and the belief corresponds logically to a premise formula. An
inference is drawn from $\alpha[x] \rightarrow \beta[x]$ and $\alpha[a]$ to $\beta[a]$. The inference disposition is realized by
means of an occurrent general state.

If a rule is not explicitly applied, one believes $\alpha[x] \rightarrow \beta[x]$ only dispositionally, and the
belief corresponds logically to a rule of the form

$$\frac{\alpha[t]}{\beta[t]}$$

where t is an arbitrary singular term.
The rule is not necessarily represented in the agent. An inference is drawn by the application
of this rule to the occurrent premise belief in the truth of $\alpha[a]$.

The last property will be stated more precisely in the next section.
What is actually expressed by this property is that an agent cannot have the
occurrent general belief that [$\alpha[x] \rightarrow \beta[x]$ is true] ([$\alpha[x] \Rightarrow \beta[x]$ is true]) without
also having the dispositional general belief that [$\alpha[x] \rightarrow \beta[x]$ is true] ([$\alpha[x] \Rightarrow
\beta[x]$ is true]), i.e., every dispositional general belief is a substate of the occurrent
general belief of the same content. E.g., if $s \models B(Bird(x) \Rightarrow CanFly(x))$, then
A has in s a certain general belief: the general normic belief that birds normally
can fly. Consequently, A is disposed to infer that [$CanFly(a)$ is true], if all that
A believes (perceptually, or occurrently centrally) is that [$Bird(a)$ is true].

Armstrong [9] does not indicate whether, according to his view, there 
are any singular beliefs that are dispositional, but all the examples he gives 
for dispositional beliefs, are indeed examples of general beliefs. Moreover, the 
typical instances of singular beliefs are indeed occurrent ones: e.g., perceptual 
beliefs about the current state of the environment are virtually always singular 
beliefs, since they are about a particular place at a particular time. At the same 
time, according to the last section, such perceptual beliefs are also occurrent 
beliefs. Thus we have: 

• (A Property of A’s Singular Beliefs) 

Perceptual beliefs are (both occurrent and) singular beliefs. 

E.g., A’s perceptual belief that [$Bird(a)$ is true] is both an occurrent
and singular belief.

But the answer to the question ‘are there singular dispositional beliefs’
should nevertheless be affirmative: e.g., an agent might have the singular belief
on the 9th of February that the birthday of his girlfriend is the 24th of December,
and so he may be disposed to give her a present on the latter day, although
he actually does not think about it on the 9th of February, and the belief has
also no other consequences at that point of time. Of course, there does not
seem to be a “logically central” way of manifestation that is associated with
a singular belief of that kind, and this is probably the reason why Armstrong
claimed dispositional singular beliefs to be non-existent, but this is just because
Armstrong has a very peculiar understanding of the term “disposition”; on the
other hand, it is clear from our example that singular beliefs may very well be
dispositional according to a broader understanding of the term.

However, given the assumptions on A that we have made up to now, it
is indeed the case that there will normally not be any belief of A which is both
singular and dispositional: the reason for this is not that singular dispositional
beliefs are impossible per se, but rather that we have assumed in chapter 2
that A believes that a is the very object which is right before her (at the
time of her belief). Moreover, every singular belief of A is - by assumption -
a belief that concerns the object denoted by a. Thus, every singular belief of
A is a belief about the properties of a specific object at a specific time and
place. Of course, such singular beliefs about a specific object might also have
dispositional consequences; e.g., A might believe that [Penguin(a) → Bird(a)
is true], and by that A is disposed to infer that [Bird(a) is true] whenever she
believes that [Penguin(a) is true]. However, in this case A might just as well
have the general belief that [Penguin(x) → Bird(x) is true] with the same
dispositional consequences, and she might have the general belief for the same
reasons for which she has the singular belief that [Penguin(a) → Bird(a) is
true]. Let us assume that A’s cognitive architecture is of such a kind that
she produces in such a case the general belief right from the beginning, and
not its (potentially infinitely many) singular instance beliefs. It follows that
A’s singular beliefs will normally be occurrent beliefs. For similar reasons, A’s
dispositional beliefs will normally be general beliefs, since dispositional beliefs
are normally about objects in general, and not about the specific object in the
immediate environment of A.

So we can make the following plausible assumptions (given our previous
assumptions, such as the exclusion of memory beliefs, etc.):

• (Assumption on A’s Singular Beliefs)

For every formula γ s.t. γ ∈ 𝓛:

if s ⊨ B(γ) then s ⊨ B^o(γ).

• (Assumption on A’s Dispositional Beliefs)

For every formula γ s.t. γ ∈ 𝓛 or γ ∈ 𝓛_→ ∪ 𝓛_⇒:

if s ⊨ B^d(γ) then γ is of the form α[x] → β[x] or of the form α[x] ⇒ β[x],

for some implication signs → and ⇒.

Thus, by the claims from above we also have:

• (Equivalence of A’s Having Dispositional & General and A’s Having General Beliefs)

For every general sentence α[x] → β[x] ∈ 𝓛_→ (α[x] ⇒ β[x] ∈ 𝓛_⇒):

A has the dispositional (general) belief that [α[x] → β[x] is true] ([α[x] ⇒
β[x] is true]) iff A has the (general) belief that [α[x] → β[x] is true]
([α[x] ⇒ β[x] is true]). Or, more formally:
for all parameter-settings s of A:
s ⊨ B^d(α[x] → β[x]) (s ⊨ B^d(α[x] ⇒ β[x])) iff

s ⊨ B(α[x] → β[x]) (s ⊨ B(α[x] ⇒ β[x])),

where the dispositional belief that [α[x] → β[x] is true] ([α[x] ⇒ β[x]
is true]) comprises the disposition to draw an inference leading to the
belief that [β[a] is true] under the circumstances that s ⊨ B(α[a]) (s ⊨
AB(α[a])), where ‘B’ or ‘AB’ have to be supplemented by the index ‘p’
or ‘c, o’; but this will be put more precisely in the next chapter.

Thus, A’s dispositional belief that [α[x] → β[x] is true] is not only a
superstate of A’s belief that [α[x] → β[x] is true], as we have pointed out in
the last section, but it is even identical to A’s belief that [α[x] → β[x] is true];
this is due to the special content of the two beliefs.

Summing up, we are led to the following picture as far as our cognitive
agent A is concerned (‘∃’ stands of course for the existence of beliefs of the
corresponding belief type):

                 Singular                 General
Occurrent        ∃ (e.g. perceptual)      ∃ (e.g. in the medical doctor example on p.48)
Dispositional    ¬∃ (normally)            ∃ (e.g. the typical general beliefs)


In the subsequent chapter we will regard inferences as processes leading from 
singular beliefs to further singular beliefs, where the inference is “based” on a 
general dispositional belief. 




Chapter 4 
INFERENCE 



4.1 Introductory Remarks 

According to folk psychology, an inference is a mental process leading from 
the belief or total belief of (the propositions expressed by) the premises to the 
belief of (the proposition expressed by) a conclusion, s.t. the latter belief is 
being created. Call the former beliefs ‘premise beliefs’ and the latter belief the 
‘conclusion belief’. 

By constraining the class of inferences to processes which lead from 
initial beliefs to further beliefs, we exclude the cases of suppositional inferences 
(conditional inferences; inferences by indirect assumption) right from the start. 
But these latter cases of inference closely resemble the primary case of
inferences starting from beliefs, and therefore we will restrict ourselves to the
primary case (just as Armstrong does when he deals with inference in his [8]). 

Sometimes such mental processes are also called ‘inferences’ where the
agent already holds the conclusion belief, although the inference is not yet
terminated, and where the premise beliefs only sustain (“maintain”, “support”
- but not with a normative connotation) the conclusion belief: in such a case,
we will not speak of an inference, but we will rather say that the premise
beliefs are the agent’s reason to have the conclusion beliefs - see the discussion
below. We use the term ‘inference’ only if the conclusion belief is also created
by the premise beliefs, i.e., if it is caused by the premise beliefs at a time
t, s.t. the conclusion belief has not been held immediately before t (say, at
time t − Δt_c, where Δt_c is the constant causation time period from the last
chapters). This causal approach to inference and reasons perfectly matches the
externalist conception of justification that we are going to develop in part II, 
but it is weaker than its typical internalist counterparts. However, the notion 
of a reason-for relation which we will outline is at least similar to the basing 
relation that is presupposed by many such theories. 



By ‘process’ we shall always mean process types; thus, A may draw
one and the same inference at different times. Moreover, we sharply distinguish
between inferences and arguments, the latter being linguistic entities, and not
mental entities.* Processes are also different from mere state-transitions in the
sense that two different processes may lead to the same state-transition. When
we speak of a ‘process’, we also have to take into account in which way a
state-transition has been achieved. We do so by defining processes as sets
of trajectories, i.e., sets of sequences of parameter-settings of A, s.t. a process
takes place in A from time t up to time t + Δt, if the sequence of
parameter-settings that A passes through from time t up to time t + Δt contains some
member of the process as its subsequence; put differently: a process takes place
in A within the sequence of A’s parameter-settings between t and t + Δt, if, by
passing through the latter sequence, A also passes through a member sequence
of the process. Note that a sequence of parameter-settings may also have unit
length, i.e., consist of just one parameter-setting: in such a case we can identify
the sequence with the parameter-setting. Therefore, the class of A’s states is a
subclass of the class of A’s processes. We say that a process leads from a belief
(or total belief) to a further belief, if every trajectory that is a member of the
process starts in the former belief and ends up in the latter belief; synonymously
we say that the process leads to or brings about a state-transition from the one
belief (or total belief) to the other. Each of these notions (‘leads to’, etc.) is
stated more precisely at the end of chapter 21. For “practical” reasons, let us
only focus on finite trajectories of A. Processes are ontologically “prior” to
state-transitions, since the former bring about the latter, but not the other
way round. But, as we will see below, whether a mental process is an inference
or not only depends on what state-transition is brought about by the process,
and on whether the process has been based on an agent’s general belief. It is
also shown at the end of chapter 21 that if a belief (or total belief) causes
another belief then a certain process leads from the former belief to the latter
belief. Inference processes are precisely such processes that lead from a belief
(or total belief) to a further belief, but not every process that leads from a
belief (or total belief) to a further belief is also an inference process.
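
The set-theoretic notion of a process described above can be illustrated by a minimal sketch. It is not the formal definition of chapter 21; the string labels for parameter-settings, the function names, and the reading of ‘subsequence’ as a contiguous run are assumptions made only for this illustration.

```python
from typing import FrozenSet, Sequence, Set, Tuple

# Assumption: a parameter-setting is modelled as an opaque string label.
Setting = str
Trajectory = Tuple[Setting, ...]
Process = FrozenSet[Trajectory]


def contains_run(history: Sequence[Setting], member: Trajectory) -> bool:
    """True if `member` occurs as a contiguous run inside `history`."""
    n, m = len(history), len(member)
    return any(tuple(history[i:i + m]) == member for i in range(n - m + 1))


def takes_place(process: Process, history: Sequence[Setting]) -> bool:
    """A process takes place if the history passes through some member of it."""
    return any(contains_run(history, member) for member in process)


def state_transitions(process: Process) -> Set[Tuple[Setting, Setting]]:
    """The (start, end) pairs brought about by the members of a process."""
    return {(member[0], member[-1]) for member in process}


# Two different processes that bring about the same state-transition s0 -> s2.
direct = frozenset({("s0", "s2")})
detour = frozenset({("s0", "s1", "s2")})

# A state corresponds to a process whose members all have unit length.
state = frozenset({("s1",)})
```

With the history s0, s1, s2, the process `detour` takes place while `direct` does not, although both bring about the transition from s0 to s2; this is the sense in which a process carries more information than the state-transition it leads to.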

Processes are sometimes subsumed under the more general class of 
events, s.t. processes are events with a positive temporal extension. However, 
instead of regarding an inference as a process, it is sometimes also regarded 
as an event without any positive temporal extension: the event of creating the 
conclusion belief on the basis of the premise beliefs (such an event could in turn 
be regarded as a state). This latter view on inferences is supported by the fact 
that such idioms as ‘being half-way through an inference’ seem “at the very 
best strained”, as Armstrong[8], p.197, points out; and he (justifiedly) adds:
“As soon as we are in a position to say ‘I infer’ ... we are in a position to say
‘I have inferred’” (but in Armstrong[9], p.94, he refers to inferring again as a
“psychological process”). A similar remark is made by Ryle (see, e.g., [139],
pp. 301-303, and [140], pp.166f). But this “strainedness” of our speaking of
inferences in terms of processes is maybe just a consequence of the fact that
we are usually only consciously aware of the results of our inference processes,
and not of the processes themselves. Indeed, when I say ‘I infer’ then I am
also in a position to say ‘I have inferred’, but this only entails that we do not
use inference ascriptions until the corresponding inference has actually been
finished. Later, we will define inference ascription in precisely this way, but we
nevertheless suggest regarding inferences as processes: those processes that have
traditionally been regarded as the mental counterparts to logical derivations.
But the appropriate mental counterparts to logical derivations are definitely not
events with zero temporal extension. Indeed, according to our theory, inferences
are even processes that take at minimum a fixed time duration Δt_c; however,
this is just because we assume for the sake of theoretical simplicity that every
instance of direct causation and sustaining lasts for precisely such an amount
of time. Moreover, since we presume a discrete flow of time, we would not even
be able to define events with “point length”. The best substitute is a sequence
of length one - sets of such sequences are, in our theory, states, and indeed also
processes.

*Goldman[69], p.4, emphasizes this distinction by claiming that “. . . epistemology is
interested in inference, it is not (primarily) interested in inferences construed as argument forms.
Rather, it is interested in inferences as processes of belief formation or belief revision, as
sequences of psychological states.”



We will need ways of expressing that A is disposed to draw an
inference, or that A is actually drawing an inference, in the object language,
and we suggest employing conditional sentences for this purpose. More usually,
inferences are represented linguistically by means of arguments of the form

    α_1
    ...
    α_n
    -----
    β

where α_1, . . . , α_n are the premises, and β is the conclusion. But when we say
that A draws an inference, we say something about the cognitive agent A, and
that is why we have to use descriptive sentences for this task and not arguments.
Since we take such descriptive sentences to be conditionals of the form α → β
or α ⇒ β, the premises are now part of the antecedent α of the conditional, and
the conclusion is the consequent β of the conditional. Moreover, for simplicity,
we restrict ourselves to the case of just one premise α. α may of course be
thought of as the conjunction of the premises α_1, . . . , α_n, although it should be
noted that this already includes some sort of (mild) epistemic presupposition.



Finally, we assume the following relation between A’s belief in the truth
of a conjunction formula, and A’s belief in the truth of the conjuncts:

• (Assumption concerning A’s Conjunction Beliefs)

For all α, β ∈ 𝓛, for all parameter-settings s:
if s ⊨ B(α ∧ β) then s ⊨ Bα and s ⊨ Bβ.

4.2 A Sketch of Our Theory of Inference 

An inference is a mental process in A that leads from one belief state (or
total belief state) to a further belief state. But it is easy to see that not every
process that brings about a state-transition from one belief to another is an
inference: e.g., if I cause A to believe that [Bird(a) is true] by external electrode
stimulation, and if I afterwards cause A to believe that [Bird(a) ∨ Penguin(a)
is true] by external means of the same kind, then A has experienced a
state-transition from the belief in Bird(a) to the belief in Bird(a) ∨ Penguin(a), but
one would not call the process that has led to that state-transition an inference.
In order to count as an inference, the belief in Bird(a) should rather have caused
the belief in Bird(a) ∨ Penguin(a). But, as Armstrong[8] has observed, this
is still only a necessary condition for inference: e.g. (actually, this is Moore’s
example which is only used by Armstrong) I may think that there is somebody
in the house, which causes me to open a door, which - as I can see no one in the
house - causes me to believe that there is nobody in the house. In such a case
we would not speak of an inference, although one belief has caused another.

In Armstrong[8], pp. 193-200, the following improved account of infer- 
ence is suggested (or, rather, Armstrong’s characterisation may be put in this 
way): 

Definition 15 (Explication of Inference Ascription - To Be Abandoned) 

A directly infers β from α if and only if

1. A’s belief that [α is true] causes A to acquire the belief that [β is true],

2. A’s belief that [α is true] is causally active right up to the time that A
acquires the belief that [β is true],

3. A’s belief that [α is true] is the only belief which is causally relevant right
up to the moment of A’s arriving at the belief that [β is true].

A infers β from α if and only if there are sentences α = γ_1, γ_2, . . . , γ_{k−1},
γ_k = β, s.t. for all i ∈ {1, . . . , k − 1}: A directly infers γ_{i+1} from γ_i.
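
The last clause of Definition 15 reduces inference to chains of direct inferences. A minimal sketch of that chaining clause alone may be helpful. It takes the direct-inference relation as given and says nothing about the causal conditions 1 to 3; the function name, the string encoding of sentences, and the requirement that a chain contains at least one step are assumptions of this illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Sentence = Hashable


def infers(direct: Iterable[Tuple[Sentence, Sentence]],
           premise: Sentence, conclusion: Sentence) -> bool:
    """True if a chain of direct inferences leads from premise to conclusion.

    Assumption: the chain has at least one step (k >= 2 in the definition).
    """
    successors: Dict[Sentence, Set[Sentence]] = {}
    for src, dst in direct:
        successors.setdefault(src, set()).add(dst)

    frontier: List[Sentence] = list(successors.get(premise, ()))
    seen: Set[Sentence] = set()
    while frontier:
        current = frontier.pop()
        if current == conclusion:
            return True
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(successors.get(current, ()))
    return False


# Hypothetical direct inferences of the agent A.
direct_inferences = {
    ("Penguin(a)", "Bird(a)"),
    ("Bird(a)", "Bird(a) v Penguin(a)"),
}
```

Here A infers Bird(a) ∨ Penguin(a) from Penguin(a) via the intermediate sentence Bird(a), although no single direct inference connects the two.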

Armstrong [9] himself abandons this “greatly oversimplified account”
(p.94) of inference for two reasons†:

†In [9], Armstrong no longer distinguishes between direct and indirect inference but only
deals with inference simpliciter; however, it is clear that he concentrates in [9] on what he
has called ‘direct inference’ in [8].

(i) If A infers β from α, then A’s belief that [α is true] does not only
create the belief that [β is true], but the belief that [α is true] also causally
sustains‡ the belief that [β is true], where ‘x causally sustains y’ means that x
contributes to the maintaining of y in existence for some amount of time without
necessarily causing y. Armstrong[9], p.80, characterizes causal sustaining by the
following example: “Pillars may sustain a roof. The pillars’ presence underneath
the roof is one state of affairs. It sustains, that is, maintains in existence another
state of affairs: the roof’s staying up. Nothing need be happening.” The pillars’
presence sustains the roof’s staying up, but it does not cause it; something
else must have caused it before. Moreover, pillars are disposed to sustain the
roof’s staying up given a roof above them. In a similar way, (i) the (presence of
the) premise belief of an inference sustains the (“staying up” of the) conclusion
belief for a certain amount of time, and (ii) just as in the case of causation,
sustaining entails a certain dispositional state; we are going to define sustaining
more precisely below.§ Furthermore, we will see below that inferences involve
the premise beliefs’ being reasons for the conclusion beliefs; but if a belief is a
reason for a further belief, the former belief sustains the latter belief if both of
them are present. Finally we will find that if A infers β from α, this implies
that A has a certain general dispositional belief; due to the latter belief, A is
not only disposed to create the belief that [β is true] given the belief that [α
is true], but she is also disposed to sustain the belief that [β is true] given the
belief that [α is true] (and - this is entailed by the term ‘sustain’ - the then
present belief that [β is true]).

(ii) Even if the term ‘causes’ of def.15 is replaced by the term ‘causes
and causally sustains’, then the premise belief’s causing and causally sustaining
the conclusion belief is still only a necessary but not a sufficient condition
for being an inference. This is shown by Armstrong[9], pp. 82-85, by stating
examples in which the belief that [α is true] causes and causally sustains the
belief that [β is true] in some deviant way. E.g. consider the following situation
(this is a variation of Armstrong’s [9] example on p.83): A believes that [α
is true] and this causes A also to believe that [β is true], since the belief in α
typically causes the belief in β, if not too much introspective attention is drawn
to the processes subserving this causation. But actually, β is rather dubious,
and if A were to scrutinize it carefully, she would soon cease to believe it.
Worse, A would indeed scrutinize it carefully, and so reject it, but she does not
because she is totally preoccupied with α. The proposition expressed by α is
very exciting for A, and while A believes it she is busy with its implications and
has no time for thinking about β. So A’s belief in α both causes and causally
sustains A’s belief in β, and A is even disposed to create/sustain the belief that
[β is true] given the belief that [α is true], but nevertheless we would not say
that A infers β from α.

‡Actually, Armstrong[9] uses the term ‘weakly causally sustains’ where we just use the
more concise ‘causally sustains’.

§A similar, though not perfectly identical point has been made by Audi[12], pp. 154-157.
Where Armstrong speaks of a ‘created belief’ (i.e., created by the inference), Audi speaks of
a ‘reasoned belief’ or ‘episodically inferential belief’; where Armstrong calls a belief ‘causally
sustained’, Audi calls it a ‘belief for a reason’ or a ‘structurally inferential belief’.

Due to these problems, Armstrong suggests in [9] explicating the notion
of an inference (or of inferring) by first explicating the notion of the
reason-for relation. If an inference takes place within an agent, s.t. the inference is from
a certain premise belief to a certain conclusion belief, the premise belief does
not only cause and sustain the conclusion belief, but the former belief is also
the agent’s reason for having the latter belief, i.e., the conclusion belief is
inferred from the premise belief in virtue of their contents (this has obviously not
been the case in the last example). But this should not be misunderstood as
the presumption that inferences are necessarily meant to be “justified”,
“correct”, “valid”, or “rational”, in some sense of these words, since otherwise we
would no longer be able to speak of an “unjustified”, “incorrect”, “invalid”,
or “irrational” inference, which intuitively we are indeed able to do. If, e.g., A’s belief
that [α is true] creates and causally sustains A’s belief that [β is true], and α
logically implies β, then A has nevertheless not necessarily drawn an inference
from the first belief to the second, as may be seen again by thinking of some
“deviant” instance of causation and sustaining. Moreover, it might be difficult
to find out, even for the experienced logician, that α really logically implies
β, and assume that A has indeed not found out about this fact; thus there
might be no actual reason for A to believe that α and β are related by logical
implication. Rather, if A draws an inference from the belief that [α is true] to
the belief that [β is true], the former belief should be a reason to create and
sustain the latter belief from the viewpoint of A; put differently: the premise
belief should be the reason for A to create and/or sustain the conclusion belief,
independently of whether it is an objectively good reason, and independently of
whether this can be justified objectively or not. Thus, the agent in our example
must rather have some third belief that “connects” the premise belief to the
conclusion belief, s.t. the premise belief has caused the conclusion belief due
to the presence of this third belief, but where the third belief is not necessarily
true or justified. Let us now suppose that A has, at some time t, this third
belief and that beside that A only believes that [α is true]: if, in the example
discussed before, A’s belief that [α is true] has caused A’s belief that [β is true]
due to the presence of this third belief, this should also be the case at time t,
since the third belief is again present at time t. Thus, the additional belief has
to be a dispositional belief, s.t. A is disposed to create and causally sustain
the belief that [β is true] at a time t + Δt_c in the circumstances that A
believes at t that [α is true] (and some other conditions are satisfied that we are
going to ignore, like: A intends to draw this inference, etc.). We have seen in
the last chapter that A’s dispositional beliefs are normally general beliefs, and
that if A has a general belief, then A is disposed to draw an inference where
the antecedent of the general sentence that expresses the content of the general
belief expresses just the content of the premise belief (after replacing x by a)
and similarly for the consequent and the content of the conclusion belief (after
the same replacement). So we will define only those mental processes as
inferences which lead to state-transitions between beliefs and which are additionally
accompanied by corresponding general dispositional beliefs, or, more briefly, by
corresponding general beliefs; recall that we have seen that every general belief
of A is a general dispositional belief and vice versa. These general beliefs relate
- from the viewpoint of A - the premise beliefs and the conclusion beliefs with
respect to their contents. But the latter general dispositional beliefs should not
be mistaken for further premise beliefs of the inferences considered. In terms of
an analogy: the general dispositional belief on which an inference is based
corresponds to a rule of inference by which a conclusion formula is inferred from
a set of premise formulas; but rules of inference are not formulas themselves,
and, in particular, they are not themselves premise formulas. The notorious
regress known as “Lewis Carroll’s Paradox” is also avoided in this way. The
agent A may of course also have an occurrent general belief with the same
content as the dispositional general belief on which her inference is based, but this
occurrent belief - which is by our explication in the last chapter a superstate
of A’s dispositional general belief - is not necessary for A’s inference.

We will therefore start our explication of the notion of inference (roughly 
in the line of [9], chapter 6) with the general dispositional beliefs that are asso- 
ciated with both reasons and inferences, and we will focus on the dispositions 
to cause and sustain beliefs that they involve. 

Let us consider some examples; assume that: (i) A draws an inference
from her belief that there is a bird right in front of her to the belief that there
is something right in front of her that is a bird or a penguin; (ii) A draws an
inference from her belief that there is a penguin right in front of her to the
belief that there is a bird right in front of her; (iii) A draws an inference from
her total belief that there is a bird right in front of her to the belief that there
is something right in front of her that is able to fly; (iv) A draws an inference
from her total belief that there is a bird right in front of her to the belief that
there is something right in front of her that is not able to fly.

The inferences in the examples might e.g. have been based on the
following general dispositional beliefs: (i) A’s general, dispositional, universal
belief that [Bird(x) → Bird(x) ∨ Penguin(x) is true]; (ii) A’s general,
dispositional, universal belief that [Penguin(x) → Bird(x) is true]; (iii) A’s general,
dispositional, normic belief that [Bird(x) ⇒_nor CanFly(x) is true]; (iv) A’s
general, dispositional, normic belief that [Bird(x) ⇒_nor ¬CanFly(x) is true].
In the case of (iv), we would say that A’s general belief is false, and that her
inference has therefore been based on a false general belief. Inferences may of
course also be based on general beliefs the contents of which are only
expressible by means of defeasible sentences: (iii) and (iv) are examples of such a kind.
If this is the case, then the corresponding general belief is a dispositional belief,
s.t. the agent is disposed to cause a conclusion belief whenever a certain total
premise belief is held (cf. our discussion of general beliefs on p.48); this latter
total belief is then the agent’s reason for having the conclusion belief. In (i) and
(ii), the premise belief that is the agent’s reason for having the conclusion belief
is not total. Independently of whether the general belief that is associated with
an inference is strict or defeasible, it is not only a dispositional belief, s.t. the
agent is disposed to cause the conclusion belief, but it is also a dispositional
belief, s.t. the agent is disposed to sustain the conclusion belief: e.g., in (ii),
if the agent has the general belief that [Penguin(x) → Bird(x) is true], the
singular belief that [Penguin(a) is true] and the singular belief that [Bird(a) is
true], then the agent’s belief that [Penguin(a) is true] sustains her belief that
[Bird(a) is true] due to her belief that [Penguin(x) → Bird(x) is true], since
the agent’s belief that [Penguin(a) is true] is a reason for having the belief that
[Bird(a) is true] due to the presence of her belief that [Penguin(x) → Bird(x)
is true]. Otherwise we would not say that the agent really has the general belief
the content of which is the proposition expressed by Penguin(x) → Bird(x).
The same holds for the other examples.
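
The difference between strict and defeasible general beliefs in examples (i) to (iv) can be made concrete by a small sketch. It is only an illustration under simplifying assumptions: singular beliefs about a are encoded as predicate names, a total belief is encoded as a belief set containing exactly the premise, and the class and function names as well as the predicate `Small` are hypothetical. The general belief is used as a rule that licenses the step, not as a further premise.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class GeneralBelief:
    """A general dispositional belief, used like a rule rather than a premise."""
    antecedent: str   # e.g. "Bird", read as Bird(x)
    consequent: str   # e.g. "CanFly", read as CanFly(x)
    normic: bool      # True for a defeasible (normic) conditional, False for a strict one


def direct_conclusion(beliefs: FrozenSet[str], rule: GeneralBelief) -> Optional[str]:
    """Return the conclusion belief the rule would create, or None.

    Strict rule: the premise belief only has to be among A's beliefs.
    Normic rule: the premise belief has to be A's total belief.
    In both cases the conclusion must not be held already, since an
    inference is taken to create its conclusion belief.
    """
    if rule.consequent in beliefs:
        return None
    if rule.normic:
        premise_held = beliefs == frozenset({rule.antecedent})
    else:
        premise_held = rule.antecedent in beliefs
    return rule.consequent if premise_held else None


penguins_are_birds = GeneralBelief("Penguin", "Bird", normic=False)
birds_fly = GeneralBelief("Bird", "CanFly", normic=True)
```

In this encoding the strict rule fires whenever A believes Penguin(a), whatever else she believes, whereas the normic rule fires only if Bird(a) is all that A believes; once A additionally believes Penguin(a), the step to CanFly(a) is no longer licensed.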

Since the premise beliefs of inferences cause their conclusion beliefs, 
they are causally active, and thus they are occurrent beliefs. Indeed, we will 
even restrict ourselves in this chapter to only those cases of inferences where
the premise belief and the conclusion belief are occurrent and singular. This
excludes any discussion of the multifarious and notoriously difficult field of in- 
ductive inference if ‘inductive’ is understood according to the “ancient” (and 
rather outdated) philosophical tradition as being “from the specific to the gen- 
eral”. However, if ‘inductive’ is understood more broadly as being synonymous 
to ‘inductively strong’ (cf. Skyrms[161], p.7), i.e., where (i) it is improbable 
that the conclusion of an inductively strong inference is false while its premises 
are true, and (ii) the inference is not deductively valid, we will indeed also deal 
with a subclass of inductive inferences in this chapter. Moreover, also some in- 
stances of the (after Peirce) so-called abductive inferences from a general belief 
plus singular evidence to a singular hypothesis may be subsumed under the 
framework that we are going to present, and the same holds for some special 
cases of the closely related “inferences to the best explanation”. In part III, 
section 8.1, we will even consider inferences that lead from singular occurrent 
beliefs to general beliefs, but we will not call such processes ‘(inductive) infer- 
ences’ but rather ‘(inductive) reasoning processes’, which are, however, not the 
subject of this chapter. The typical cases of inferences which the theory to be 
stated now is tailor-made for, are on the one hand special cases of deductive 
inferences and on the other hand cases of nonmonotonic inferences where high 
probability and normic inferences are included; in each case both the premise 
belief and the conclusion belief are occurrent and singular. Thus, the premise 
beliefs of such inferences are, according to the last chapter, either (perhaps 
total) perceptual beliefs or (perhaps total) occurrent central state beliefs; the
conclusion beliefs of inferences are necessarily occurrent central state beliefs,
since they are caused by the premise beliefs and we have assumed in the last 
chapter that perceptual beliefs are only determined from the “outside” and 
therefore cannot be caused by inferences. The class of inferences which we are 
going to deal with is thus still large enough to be of theoretical interest, but at 
the same time it is sufficiently restricted in order to be explicable formally as 
cognitive processes of a special kind. 

In order to simplify our notation we will use the operators B, AB, b,
and tb, where we should actually use either the operators B^p, AB^p, b^p, and tb^p
or B^{c,o}, AB^{c,o}, b^{c,o}, and tb^{c,o} for ascribing premise beliefs, and where we should
use only the latter for ascribing conclusion beliefs. The theory of inference which
we are going to develop is claimed to be valid for both perceptual premise beliefs
and occurrent, singular, central state premise beliefs. We do not have to use
the superscript ‘o’ together with B, AB, b, and tb, since we will only consider
inferences from beliefs to beliefs, where the contents of the beliefs are expressed
by sentences in 𝓛, but, as we have assumed in the last chapter, such beliefs are
occurrent beliefs.

Similar to Armstrong[8], and contrary to Armstrong[9], we will distin- 
guish direct from indirect inferences, and - after focusing on A’s dispositions to 
cause and sustain belief - we will first deal with direct inference, i.e., inference 
within one step, and we will then define the notion of indirect inference on the 
basis of the first definition. 



4.3 Dispositions to Change to Beliefs and to Remain in Beliefs 

For the rest of this chapter, let s be an arbitrary parameter-setting of A. 

In chapters 2 and 3 we have already added a binary causation 
predicate Causes to our vocabulary, and we have defined causation by beliefs 
or total beliefs relative to an undefined predicate ‘is disposed to change to’; the 
latter predicate is defined in def.187 in the appendix. Let us now introduce a 
binary predicate Sustains, which expresses the causal sustaining relation, and 
which we also add to our vocabulary: 

• (Notation for Sustaining of A’s Beliefs by A’s Beliefs) 

$s \models Sustains(b\alpha, b\beta)$ if and only if A’s belief that [$\alpha$ is true] sustains in s 
A’s belief that [$\beta$ is true]. 

We use again the same fixed period $\Delta t_c > 0$ of time as in the case of 
causation by beliefs, when we say that A’s belief that [$\alpha$ is true] sustains in s 
A’s belief that [$\beta$ is true] if and only if: (i) A believes in s that [$\alpha$ is true], (ii) A 
believes in s that [$\beta$ is true], (iii) the system A is disposed in s to remain (for 
the amount $\Delta t_c$ of time) in the belief that [$\beta$ is true] given the belief that [$\alpha$ is 
true] and (iv) given that the perceptual input to A is constant for the amount 
$\Delta t_c$ of time. 

• (Notation for Sustaining of A’s Beliefs by A’s Total Beliefs) 

$s \models Sustains(tb\alpha, b\beta)$ if and only if A’s total belief that [$\alpha$ is true] sustains 
in s A’s belief that [$\beta$ is true]. 

We say that A’s total belief that [$\alpha$ is true] sustains in s A’s belief that 
[$\beta$ is true] if and only if: (i) all that A believes in s is that [$\alpha$ is true], (ii) A 
believes in s that [$\beta$ is true], (iii) the system A is disposed in s to remain (for 
the amount $\Delta t_c$ of time) in the belief that [$\beta$ is true] given the total belief that 
[$\alpha$ is true] and (iv) given that the perceptual input to A is constant for the 
amount $\Delta t_c$ of time. 

In both cases, ‘remaining in’ refers to (a preservation property of) a 
state-transition and is much weaker than ‘sustaining’, since it does not entail a 
dispositional state (‘being disposed to remain in’ is defined in def.187 in chapter 
21). We will use the predicate Sustains only to express sustaining for the time 
period $\Delta t_c$, but not for different amounts of time. 

A belief or total belief may only sustain another belief if the latter 
belief is already “there”; this is in contrast to the causation of a belief by a belief 
or total belief. Moreover, contrary to causation sentences, sustaining sentences 
like $Sustains(b\alpha, b\beta)$ or $Sustains(tb\alpha, b\beta)$ are satisfied by single 
parameter-settings, since A’s “cognitive history” is irrelevant for sustaining. A’s “cognitive 
future” is indeed relevant, but this is already expressed by the disposition 
ascription. 

So we have: 

Definition 16 (Sustaining of A’s Beliefs by A’s Beliefs) 
$s \models Sustains(b\alpha, b\beta)$ iff 

1. $s \models B\alpha$ 

2. $s \models B\beta$ 

3. A is disposed in s to remain (for the amount $\Delta t_c$ of time) in the belief 
that [$\beta$ is true] given the belief that [$\alpha$ is true] 

(and given that the perceptual input to A is constant for the amount $\Delta t_c$ 
of time). 

Analogously, for sustaining by total beliefs: 

Definition 17 (Sustaining of A’s Beliefs by A’s Total Beliefs) 
$s \models Sustains(tb\alpha, b\beta)$ iff 

1. $s \models AB\alpha$ 

2. $s \models B\beta$ 

3. A is disposed in s to remain (for the amount $\Delta t_c$ of time) in the belief 
that [$\beta$ is true] given the total belief that [$\alpha$ is true] 

(and given that the perceptual input to A is constant for the amount $\Delta t_c$ 
of time). 

If a central state belief is sustained by a total central state belief - 
in this case add the index ‘c, o’ to the operators in def.17 - this has a special 
consequence: 

Corollary 18 

If $s \models Sustains(tb^{c,o}\alpha, b^{c,o}\beta)$, then A’s central state belief that [$\beta$ is 
true] is a substate of A’s central state belief that [$\alpha$ is true]. 

This is entailed by 1 and 2 in the definition of sustaining by total beliefs 
above, and by the explication of total belief ascription in section 3.2.^ 

Now we will introduce an object language notation for A’s being 
disposed to change to and to remain in beliefs: let $\rightarrow_{dis}$ and $\Rightarrow_{dis}$ be two (fixed) 
binary connectives, i.e., ‘$\rightarrow_{dis}$’, ‘$\Rightarrow_{dis}$’ are metalinguistic individual constants. 
Let $\mathcal{L}_{\rightarrow_{dis}}$ be the set of formulas of the form $\alpha \rightarrow_{dis} \beta$ where $\alpha, \beta \in \mathcal{L}$; 
analogously for $\mathcal{L}_{\Rightarrow_{dis}}$. $\rightarrow_{dis}$ is used to express monotonic disposition ascriptions, 
$\Rightarrow_{dis}$ is used to express nonmonotonic disposition ascriptions: 

Definition 19 (Ascription of A’s Dispositions to Change to Beliefs and to 
Remain in Beliefs) 

For all $\alpha, \beta \in \mathcal{L}$: 

1. $s \models \alpha \rightarrow_{dis} \beta$ iff 

(a) A is disposed in s to change (after the amount $\Delta t_c$ of time) to the 
belief that [$\beta$ is true] given the belief that [$\alpha$ is true] 

(and given that the perceptual input to A is constant for the amount 
$\Delta t_c$ of time) 

(b) A is disposed in s to remain (for the amount $\Delta t_c$ of time) in the 
belief that [$\beta$ is true] given the belief that [$\alpha$ is true] 

(and given that the perceptual input to A is constant for the amount 
$\Delta t_c$ of time) 



^But note that if we added indices differently, and assumed that $s \models Sustains(tb^{p}\alpha, b^{c,o}\beta)$, 
then a corresponding proposition could no longer be derived. 
2. $s \models \alpha \Rightarrow_{dis} \beta$ iff 

(a) A is disposed in s to change (after the amount $\Delta t_c$ of time) to the 
belief that [$\beta$ is true] given the total belief that [$\alpha$ is true] 

(and given that the perceptual input to A is constant for the amount 
$\Delta t_c$ of time) 

(b) A is disposed in s to remain (for the amount $\Delta t_c$ of time) in the 
belief that [$\beta$ is true] given the total belief that [$\alpha$ is true] 

(and given that the perceptual input to A is constant for the amount 
$\Delta t_c$ of time). 

When we say that (i) A is disposed in s to change to and remain in 
the belief that [$\beta$ is true] given the belief that [$\alpha$ is true], we do not want to 
say that, whenever (ii) A additionally believes in s that [$\alpha$ is true], it follows that A 
also actually draws an inference from $\alpha$ to $\beta$. Rather, if (i) and (ii) are the case, 
then A’s belief that [$\alpha$ is true] causes and sustains A’s belief that [$\beta$ is true] 
(see corollary 21 below); but this is still not sufficient for an inference (recall 
Armstrong’s example from above). Below, we will define direct inferences in such a 
way that A indeed draws a direct monotonic inference from $\alpha$ to $\beta$ only if (i) and 
(ii) are the case and additionally (iii) A has in s the general belief that 
[$\alpha[x] \rightarrow \beta[x]$ is true] for some strict conditional sign $\rightarrow$ (and similarly 
for general beliefs having propositions expressed by defeasible conditionals as 
their contents, and direct nonmonotonic inferences - see below). 

Ascriptions involving $\rightarrow_{dis}$ (and analogously for $\Rightarrow_{dis}$) are meant to 
entail existential claims of the following sort (these claims can even be proved 
using the formal machinery of the first chapter of the appendix): there is a 
process i s.t. (i) i leads to a state-transition from the belief that [$\alpha$ is true] to 
the belief that [$\beta$ is true], and (ii) A is disposed to initiate i in s. 

By the assumption on the conjunction of beliefs on p.58, we furthermore 
have for all parameter-settings s that $s \models B(\alpha \wedge \gamma)$ implies that $s \models B\alpha$; 
on the other hand, $s \models AB(\alpha \wedge \gamma)$ does not necessarily imply that $s \models AB\alpha$. 
That is the reason why the following corollary holds (it can be derived from 
def.19 above and def.187 in chapter 21): 

Corollary 20 

1. If $s \models \alpha \rightarrow_{dis} \beta$, then necessarily also $s \models \alpha \wedge \gamma \rightarrow_{dis} \beta$ 

2. If $s \models \alpha \Rightarrow_{dis} \beta$, then not necessarily $s \models \alpha \wedge \gamma \Rightarrow_{dis} \beta$. 

E.g., A could be (monotonically) disposed to change to and remain in 
the belief that there is a bird if given the belief that there is a penguin. In this 
case, A would also be disposed to change to and remain in the belief that there 
is a bird if given the belief that there is a penguin which is dead. At the same 
time, A might be disposed to change to and remain in the belief that 
there is something which is able to fly if given the total belief that there is a 
bird. But this does not necessarily entail that A is also disposed to change to 
and remain in the belief that there is something which is able to fly if given 
the total belief that there is a bird which is a penguin (the usual dispositions 
of human beings are obvious counterexamples). By ‘not necessarily’ we simply 
mean that the corresponding condition is not the case for all agents A that 
satisfy our presumptions; analogously for all subsequent instances of the 
expression ‘not necessarily’. Corollary 20 justifies the terminology of ‘monotonic’ 
vs. ‘nonmonotonic’ dispositions of changing/remaining. 
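
The contrast stated in corollary 20 can be illustrated by a small toy model. The following is only a sketch under strong simplifying assumptions that are not part of the theory: belief contents are modelled as sets of atomic labels, a strict rule applies whenever its antecedent is contained in what is believed, and a defeasible rule is evaluated against the total belief and blocked by listed exceptions. The names StrictRule, DefeasibleRule, monotonic_disposition and nonmonotonic_disposition are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrictRule:
    # Hypothetical stand-in for a strict general belief alpha[x] -> beta[x].
    antecedent: frozenset
    consequent: str


@dataclass(frozen=True)
class DefeasibleRule:
    # Hypothetical stand-in for a defeasible general belief alpha[x] => beta[x].
    antecedent: frozenset
    consequent: str
    exceptions: frozenset = frozenset()


def monotonic_disposition(strict_rules, alpha, beta):
    """Toy reading of alpha ->_dis beta: some strict rule for beta
    applies to every belief state that contains alpha."""
    return any(
        r.consequent == beta and r.antecedent <= alpha for r in strict_rules
    )


def nonmonotonic_disposition(defeasible_rules, alpha, beta):
    """Toy reading of alpha =>_dis beta: alpha is treated as the TOTAL
    belief, so any exception contained in alpha blocks the rule."""
    return any(
        r.consequent == beta
        and r.antecedent <= alpha
        and not (r.exceptions & alpha)
        for r in defeasible_rules
    )


STRICT = [StrictRule(frozenset({"penguin"}), "bird")]
DEFEASIBLE = [
    DefeasibleRule(frozenset({"bird"}), "can_fly", frozenset({"penguin"}))
]

penguin = frozenset({"penguin"})
dead_penguin = frozenset({"penguin", "dead"})
bird = frozenset({"bird"})
penguin_bird = frozenset({"bird", "penguin"})

# Strengthening the antecedent preserves the monotonic disposition ...
mono_plain = monotonic_disposition(STRICT, penguin, "bird")
mono_strengthened = monotonic_disposition(STRICT, dead_penguin, "bird")
# ... but need not preserve the nonmonotonic one.
nonmono_plain = nonmonotonic_disposition(DEFEASIBLE, bird, "can_fly")
nonmono_strengthened = nonmonotonic_disposition(DEFEASIBLE, penguin_bird, "can_fly")
```

In this sketch the subset test makes the first kind of disposition closed under strengthening of the antecedent by construction, whereas the exception check is what allows the second kind to be lost when the total belief becomes more specific.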

Def.19 also directly implies (by the meaning of Causes, Sustains; this 
can be proved thoroughly by making use again of the definitions stated in 
chapter 21, in particular def.196): 

Corollary 21 

For all trajectories $(s(t))_{t \in In}$: 

1. if $s(t) \models \alpha \rightarrow_{dis} \beta$ then: 

(a) if $s(t) \models B\alpha$, then 

i. $t + \Delta t_c, (s(t))_{t \in In} \models Causes(b\alpha, b\beta)$ 
ii. $s(t + \Delta t_c) \models B\beta$ 

(b) if $s(t) \models B\alpha$ and $s(t) \models B\beta$, then $s(t) \models Sustains(b\alpha, b\beta)$ 

2. if $s(t) \models \alpha \Rightarrow_{dis} \beta$ then: 

(a) if $s(t) \models AB\alpha$, then 

i. $t + \Delta t_c, (s(t))_{t \in In} \models Causes(tb\alpha, b\beta)$ 
ii. $s(t + \Delta t_c) \models B\beta$ 

(b) if $s(t) \models AB\alpha$ and $s(t) \models B\beta$, then $s(t) \models Sustains(tb\alpha, b\beta)$. 

If we put the right hand sides of the major if-then clauses in words, we 
have: if A believes at t that [$\alpha$ is true], then A’s belief that [$\alpha$ is true] causes 
at $t + \Delta t_c$ A’s belief that [$\beta$ is true]; if A believes at t that [$\alpha$ is true] and also 
that [$\beta$ is true], then A’s belief that [$\alpha$ is true] causally sustains at t A’s belief 
that [$\beta$ is true]; if all that A believes at t is that [$\alpha$ is true], then A’s total belief 
that [$\alpha$ is true] causes at $t + \Delta t_c$ A’s belief that [$\beta$ is true]; if all that A believes 
at t is that [$\alpha$ is true], and if additionally A believes at t that [$\beta$ is true], then 
A’s total belief that [$\alpha$ is true] causally sustains at t A’s belief that [$\beta$ is true]. 

Corollary 21 shows why we have to quantify over the possible 
trajectories for A rather than over arbitrary sequences of parameter-settings: 
if $\alpha \rightarrow_{dis} \beta$ or $\alpha \Rightarrow_{dis} \beta$ is satisfied by a parameter-setting, this has 
consequences for the way in which the system A evolves (since the satisfaction of 
$\alpha \rightarrow_{dis} \beta$ or $\alpha \Rightarrow_{dis} \beta$ has consequences for A’s dispositional state), and thus 
the dynamics of A is relevant. Moreover, the corollary indicates that it is a 
consequence of our definitions of Causes and Sustains that ‘being disposed to 
cause’ is actually synonymous with ‘being disposed to change to’; the same holds 
for ‘being disposed to sustain’ and ‘being disposed to remain in’. But ‘causes’ 
is of course not synonymous with ‘changes to’, and ‘sustains’ is not synonymous 
with ‘remains in’. 

By making use of the distinction between strict general sentences (of 
the form $\alpha[x] \rightarrow \beta[x]$) and defeasible general sentences (of the form 
$\alpha[x] \Rightarrow \beta[x]$), we can now characterize A’s dispositional beliefs more precisely, and we 
can also sharpen the equivalence of A’s having dispositional&general beliefs and A’s 
having general beliefs that we have referred to on p.52. For the 
reasons given in item (i) on p.48, we suggest that whether the presence of a 
general belief entails the presence of a dispositional belief, which in turn entails 
the presence of a disposition to change to and remain in a belief monotonically, 
or whether it entails the presence of a disposition to change to and remain in a 
belief nonmonotonically, is determined by the content of the general belief. If 
the content of the general belief is expressed by a strict general sentence, A will 
be disposed to change to and remain in the belief monotonically; this is also part of 
Armstrong’s theory of beliefs. If the content of the general belief is expressed by a 
defeasible general sentence, A will be disposed to change to and remain in the belief 
nonmonotonically; this is a supplement that we add to the usual folk-psychological 
account of belief: 

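let $\rightarrow$, $\Rightarrow$ be some binary connectives that we might consider; 

• (A Property of General and Dispositional Beliefs Reconsidered) 

For all strict general sentences $\alpha[x] \rightarrow \beta[x]$, 

for all defeasible general sentences $\alpha[x] \Rightarrow \beta[x]$: 

1. $s \models B(\alpha[x] \rightarrow \beta[x])$ iff $s \models B^{d}(\alpha[x] \rightarrow \beta[x])$ 

2. $s \models B(\alpha[x] \Rightarrow \beta[x])$ iff $s \models B^{d}(\alpha[x] \Rightarrow \beta[x])$. 

• (A Further Property of General and Dispositional Beliefs Reconsidered) 

For all strict general sentences $\alpha[x] \rightarrow \beta[x]$, 

for all defeasible general sentences $\alpha[x] \Rightarrow \beta[x]$: 

1. if $s \models B^{d}(\alpha[x] \rightarrow \beta[x])$ then $s \models \alpha[a] \rightarrow_{dis} \beta[a]$ 

2. if $s \models B^{d}(\alpha[x] \Rightarrow \beta[x])$ then $s \models \alpha[a] \Rightarrow_{dis} \beta[a]$. 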

The other directions are not necessarily the case. 
4.4 Direct Reasons and Inferences 

For $\alpha, \beta \in \mathcal{L}$, we shall need a formal way of saying that A’s (perhaps total) 
belief that [$\alpha$ is true] is her direct reason in s(t) for believing that [$\beta$ is true]; 
but by this we do not want to entail that the belief that [$\alpha$ is true] is A’s only 
direct reason in s(t) for believing that [$\beta$ is true], i.e., there might be other 
reasons, too. When we say that A’s belief or total belief that [$\alpha$ is true] is 
A’s direct reason in s for believing that [$\beta$ is true], this is not meant to entail 
necessarily that A also believes that the (total) belief that [$\alpha$ is true] is her 
direct reason in s for believing that [$\beta$ is true], i.e., A might be totally ignorant 
of what her direct reasons are for believing something. 

Moreover, if we say that A’s (perhaps total) belief that [$\alpha$ is true] is her 
direct reason in s(t) for believing that [$\beta$ is true], this should be taken as being 
synonymous to: that A believes that [$\alpha$ is true] is a direct reason in s(t) for the 
fact that A believes that [$\beta$ is true]. We emphasize this because in the first case 
‘reason for’ seems to be a predicate of belief states, whereas in the second case 
we might consider ‘reason for’ either as a predicate of states of affairs, or as an 
operator that is syntactically applied to sentences expressing belief ascriptions. 
We prefer the latter interpretation for the following reason: assume that A’s 
belief that [$\alpha$ is true] is her direct reason in s(t) for believing that [$\beta$ is true], 
where ‘reason for’ is a predicate of belief states; according to def.22 below, this 
will be seen to involve that A believes in $s(t - \Delta t_c)$ that [$\alpha[x] \rightarrow \beta[x]$ is true] 
(for some conditional $\alpha[x] \rightarrow \beta[x]$). It might now be the case that A’s belief 
that [$\alpha$ is true] is identical to another belief state of A, say, A’s belief that [$\gamma$ 
is true], since whenever A believes that [$\alpha$ is true], she also believes that [$\gamma$ is 
true], and vice versa. Thus, by the identity of the belief states, it would follow 
that also A’s belief that [$\gamma$ is true] is her direct reason in s(t) for believing that 
[$\beta$ is true], although it might perhaps be the case that A does not believe in 
$s(t - \Delta t_c)$ that $\gamma[x] \rightarrow \beta[x]$. This problem can be avoided by using a reason-for 
operator. Whenever we seem to use ‘reason for’ as a predicate of belief states, 
this should be silently translated into the operator idiom. 

Instead of regarding the (total) belief that [$\alpha$ is true] as a reason for the 
belief that [$\beta$ is true], and instead of using an operator ascription of reasons, 
we might - perhaps even more appropriately - have called the proposition that 
[$\alpha$ is true] a direct reason for the belief that [$\beta$ is true], s.t. ‘reason for’ is now 
a predicate of propositions. Some examples from ordinary language even seem 
to be handled more easily by considering the content of the former belief as 
the first relatum of the reason-for relation, and not the belief itself (this is e.g. 
the way that reasons are treated by Armstrong; see Armstrong[9], pp. 78-79). 
The difference between total beliefs being reasons on the one hand, and beliefs 
being reasons on the other, might be done justice by distinguishing two kinds 
of reason-for relations: one for total reasons, and one for reasons simpliciter 
(and indeed we draw such a distinction below). But since we suggest a purely 
causal approach to reasons, and since propositions do not have any causal 
consequences, we do not follow Armstrong’s line of thought here, and speak 
of beliefs as the reasons for other beliefs, and we use a reason-for operator to 
express this. This is also going to simplify our account of justified inference in 
part II. 

Since we regard reasons as playing a part in the causation or sustaining 
of what they are reasons for, it is actually not fully correct to say that the (total) 
belief that [$\alpha$ is true] is A’s direct reason in s(t) for believing that [$\beta$ is true], 
without any reference to the way in which the reason plays its causal role; there 
might be two different trajectories for the system A, s.t. both trajectories lead 
at t to the parameter-setting s(t), but where in the first trajectory, the (total) 
belief that [$\alpha$ is true] has been A’s direct reason for believing that [$\beta$ is true] in 
s(t), but where this is not the case in the second trajectory: thus, we have to 
regard reason-for ascriptions as being satisfied not by single parameter-settings, 
but by trajectories $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ of parameter-settings. 

As indicated above, a (direct) inference from $\alpha[a]$ to $\beta[a]$ involves the 
belief or total belief that [$\alpha[a]$ is true] being a (direct) reason for believing 
that [$\beta[a]$ is true], and in turn, the latter involves the agent’s having a general 
belief; thus also an inference from $\alpha[a]$ to $\beta[a]$ involves having a general belief. 
The latter general belief is the general belief in the truth of a conditional 
the antecedent of which is $\alpha[x]$ and the consequent of which is $\beta[x]$. If the 
conditional is strict, we speak of a conclusive reason; if it is defeasible, we 
speak of a defeasible reason. 

The distinction between conclusive reasons and defeasible reasons is 
again not part of Armstrong’s theory of inferring, but it may be found (though 
in somewhat different lines) in Pollock[125]. Pollock’s account of reasons is 
different from ours in assuming an epistemological (and normative) component, 
i.e., he regards reasons as something by which an agent may become justified in 
believing something else‖; but we stick to a purely causal approach, and we only 
extend the latter by the notion of a defeasible reason. The causal approach has 
the advantage that it justifies our usual speaking of “good” and “bad” reasons, 
whereas reasons in the sense of Pollock can never be bad. Moreover, according 
to Pollock, reasons are propositions that do not necessarily have to be believed; 
but the reason-for relation always involves beliefs according to our account. 

Sometimes, the term ‘conclusive’ is understood in the way that if the 
belief that [$\alpha$ is true] is a conclusive reason for A in s for believing that [$\beta$ is 
true], then $\alpha$ logically implies $\beta$. But we use ‘conclusive’ in a more general way 
by which we may refer to any kind of strict general connection between $\alpha[x]$ 
and $\beta[x]$ (also Armstrong[9] distinguishes conclusive reasons from deductively 
conclusive reasons; see p.77). 

So we distinguish between two notations for direct-reason-for ascrip- 
tions: 



‖Pollock’s theory of reasons is part of his defeasibility theory of knowledge and justification. 
• (Notation for Direct-Reason-For Ascription) 

1. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models CR^{d}(B\alpha, B\beta)$ iff the belief that [$\alpha$ is true] is 
a direct conclusive reason for A in s(t) rel. to $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ 
for the belief that [$\beta$ is true]; or, synonymously: that A believes that 
[$\alpha$ is true] is a direct conclusive reason in s(t) rel. to $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ 
for A’s believing that [$\beta$ is true] 

(where $CR^{d}$ is a binary operator, which we add to our vocabulary; 
$CR^{d}$ is to be applied to formulas in $\mathcal{L}_{B}$) 

2. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models DR^{d}(AB\alpha, B\beta)$ iff the total belief that [$\alpha$ is 
true] is a direct defeasible reason for A in s(t) rel. to $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ 
for the belief that [$\beta$ is true]; or, synonymously: that all that 
A believes is that [$\alpha$ is true] is a direct defeasible reason in s(t) rel. 
to $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ for A’s believing that [$\beta$ is true] 

(where $DR^{d}$ is a binary operator, which we also add to our 
vocabulary; $DR^{d}$ is to be applied to formulas in $\mathcal{L}_{B,AB}$). 

Now let us turn to means of ascribing direct inference on the 
object language level: let $\rightarrow_{infd}$ and $\Rightarrow_{infd}$ be two (fixed) binary connectives, 
i.e., ‘$\rightarrow_{infd}$’, ‘$\Rightarrow_{infd}$’ are metalinguistic individual constants. Let $\mathcal{L}_{\rightarrow_{infd}}$ be 
the set of formulas of the form $\alpha \rightarrow_{infd} \beta$ where $\alpha, \beta \in \mathcal{L}$; analogously for 
$\mathcal{L}_{\Rightarrow_{infd}}$. 

Since dispositions are states, sentences (like $\alpha \rightarrow_{dis} \beta$) ascribing 
dispositions are made true by parameter-settings. But inferences are not states 
themselves, but processes; thus sentences ascribing inferences are not made true 
by parameter-settings, but rather by sequences of parameter-settings. Since not 
every sequence of parameter-settings of A is necessarily also a sequence that the 
system A might pass through, let us again restrict ourselves to the trajectories 
of A as the truth-makers of inference ascriptions. 

Let $\langle s_0, \ldots, s_n \rangle$ be a trajectory of parameter-settings of A; we say: 

• (Notation for Direct Monotonic Inference Ascription) 

$\langle s_0, \ldots, s_n \rangle \models \alpha \rightarrow_{infd} \beta$ if and only if A draws in $\langle s_0, \ldots, s_n \rangle$ a direct 
monotonic inference from the belief that [$\alpha$ is true] to the belief that [$\beta$ 
is true]. 

Analogously, this will sometimes be paraphrased by: A draws in 
$\langle s_0, \ldots, s_n \rangle$ a direct monotonic inference from the proposition that [$\alpha$ is true] to 
the proposition that [$\beta$ is true]; or: A draws in $\langle s_0, \ldots, s_n \rangle$ a direct monotonic 
inference from $\alpha$ to $\beta$. 

• (Notation for Direct Nonmonotonic Inference Ascription) 

$\langle s_0, \ldots, s_n \rangle \models \alpha \Rightarrow_{infd} \beta$ if and only if A draws in $\langle s_0, \ldots, s_n \rangle$ a direct 
nonmonotonic inference from the total belief that [$\alpha$ is true] to the belief 
that [$\beta$ is true]. 

Equivalently: A draws in $\langle s_0, \ldots, s_n \rangle$ a direct nonmonotonic inference 
from the proposition that [$\alpha$ is true] to the proposition that [$\beta$ is true]; or: A 
draws in $\langle s_0, \ldots, s_n \rangle$ a direct nonmonotonic inference from $\alpha$ to $\beta$. 

Note that we will usually refer to parametrized sequences $\langle s(t_0), \ldots, 
s(t_0 + k) \rangle$ in the context of inference ascription, where s(t) is the 
parameter-setting of A at t in the sequence. 

Now we can define what it means to say that A draws a direct mono- 
tonic/nonmonotonic inference by first defining the direct conclusive/defeasible- 
reason-for relation: 

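Definition 22 (Direct Reason-for Ascription, Direct Inference Ascription) 

Let $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ be a trajectory of parameter-settings for A 
from time $t - \Delta t_c$ to time t: 

1. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models CR^{d}(B(\alpha[a]), B(\beta[a]))$ iff 
there is a strict conditional $\alpha[x] \rightarrow \beta[x]$, s.t. 

(a) $s(t - \Delta t_c) \models B(\alpha[x] \rightarrow \beta[x])$ 

(b) $s(t - \Delta t_c) \models B(\alpha[a])$ 

2. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models DR^{d}(AB(\alpha[a]), B(\beta[a]))$ iff 
there is a defeasible conditional $\alpha[x] \Rightarrow \beta[x]$, s.t. 

(a) $s(t - \Delta t_c) \models B(\alpha[x] \Rightarrow \beta[x])$ 

(b) $s(t - \Delta t_c) \models AB(\alpha[a])$ 

3. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{infd} \beta[a]$ iff 

(a) $s(t) \models CR^{d}(B(\alpha[a]), B(\beta[a]))$ 

(b) $s(t - \Delta t_c) \not\models B(\beta[a])$ 

4. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{infd} \beta[a]$ iff 

(a) $s(t) \models DR^{d}(AB(\alpha[a]), B(\beta[a]))$ 

(b) $s(t - \Delta t_c) \not\models B(\beta[a])$ 

($\Delta t_c$ is again the fixed causation/sustaining period). 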
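
The clauses of def.22 can be checked mechanically on a toy representation. The following sketch rests on assumptions that go beyond the text: a parameter-setting is reduced to a set of believed singular sentences plus a set of general beliefs, there is exactly one strict and one defeasible connective, and a total belief is modelled as the only singular sentence believed. The names Setting, conclusive_reason, defeasible_reason and the two inference checks are hypothetical. The sketch tests only the conditions on the first parameter-setting of the trajectory; it does not model A's dynamics.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class Setting:
    # Singular sentences believed in this setting, e.g. "Penguin(a)".
    beliefs: FrozenSet[str]
    # General beliefs as (kind, antecedent predicate, consequent predicate),
    # where kind is "strict" or "defeasible".
    general: FrozenSet[Tuple[str, str, str]]


def believes(s: Setting, sentence: str) -> bool:
    return sentence in s.beliefs


def total_belief(s: Setting, sentence: str) -> bool:
    # Simplification: "all that A believes" = the only singular belief.
    return s.beliefs == frozenset({sentence})


def conclusive_reason(traj: Sequence[Setting], pa: str, pb: str, a: str) -> bool:
    first = traj[0]  # plays the role of s(t - Delta t_c)
    return ("strict", pa, pb) in first.general and believes(first, f"{pa}({a})")


def defeasible_reason(traj: Sequence[Setting], pa: str, pb: str, a: str) -> bool:
    first = traj[0]
    return ("defeasible", pa, pb) in first.general and total_belief(first, f"{pa}({a})")


def direct_monotonic_inference(traj, pa, pb, a) -> bool:
    # Clause 3: conclusive reason, and the conclusion is not yet believed.
    return conclusive_reason(traj, pa, pb, a) and not believes(traj[0], f"{pb}({a})")


def direct_nonmonotonic_inference(traj, pa, pb, a) -> bool:
    # Clause 4: defeasible reason, and the conclusion is not yet believed.
    return defeasible_reason(traj, pa, pb, a) and not believes(traj[0], f"{pb}({a})")


strict = frozenset({("strict", "Penguin", "Bird")})
s0 = Setting(frozenset({"Penguin(a)"}), strict)
s1 = Setting(frozenset({"Penguin(a)", "Bird(a)"}), strict)

defeasible = frozenset({("defeasible", "Bird", "CanFly")})
u0 = Setting(frozenset({"Bird(a)"}), defeasible)
u1 = Setting(frozenset({"Bird(a)", "CanFly(a)"}), defeasible)
v0 = Setting(frozenset({"Bird(a)", "Penguin(a)"}), defeasible)

mono_ok = direct_monotonic_inference([s0, s1], "Penguin", "Bird", "a")
mono_already_believed = direct_monotonic_inference([s1, s1], "Penguin", "Bird", "a")
nonmono_ok = direct_nonmonotonic_inference([u0, u1], "Bird", "CanFly", "a")
nonmono_not_total = direct_nonmonotonic_inference([v0, v0], "Bird", "CanFly", "a")
```

The second and fourth checks fail for different reasons: in the former the conclusion is already believed at the start of the trajectory, in the latter the premise belief is not the agent's total belief.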

Ascriptions involving $\rightarrow_{infd}$ (and the same holds in the case of $\Rightarrow_{infd}$) 
are again meant to entail existential claims of the following sort (and these 
claims can again even be proved if the formal contents of chapter 21 are 
presupposed): there is a process i s.t. (i) i leads to a state-transition from the belief 
that [$\alpha$ is true] to the belief that [$\beta$ is true], and (ii) A draws i in $\langle s_0, \ldots, s_n \rangle$. 
Def.22 together with the properties of A’s general and of A’s 
dispositional beliefs stated at the end of the last section, together with def.19, and 
together with def. 7, 9, 16, 17 implies: 

Corollary 23 

1. If $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{infd} \beta[a]$ then 

(a) $s(t - \Delta t_c) \models B(\alpha[a])$ 

(b) $s(t - \Delta t_c) \not\models B(\beta[a])$ 

(c) there is a strict conditional $\alpha[x] \rightarrow \beta[x]$, s.t. $s(t - \Delta t_c) \models B^{d}(\alpha[x] \rightarrow \beta[x])$ 

(d) $s(t - \Delta t_c) \models \alpha[a] \rightarrow_{dis} \beta[a]$ 

(e) $t, \langle s(t - \Delta t_c), \ldots, s(t) \rangle \models Causes(b(\alpha[a]), b(\beta[a]))$ 

(f) $s(t) \models Sustains(b(\alpha[a]), b(\beta[a]))$ 

2. If $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{infd} \beta[a]$ then 

(a) $s(t - \Delta t_c) \models AB(\alpha[a])$ 

(b) $s(t - \Delta t_c) \not\models B(\beta[a])$ 

(c) there is a defeasible conditional $\alpha[x] \Rightarrow \beta[x]$, s.t. $s(t - \Delta t_c) \models B^{d}(\alpha[x] \Rightarrow \beta[x])$ 

(d) $s(t - \Delta t_c) \models \alpha[a] \Rightarrow_{dis} \beta[a]$ 

(e) $t, \langle s(t - \Delta t_c), \ldots, s(t) \rangle \models Causes(tb(\alpha[a]), b(\beta[a]))$ 

(f) $s(t) \models Sustains(tb(\alpha[a]), b(\beta[a]))$. 

Corollary 23 summarizes the essential properties of sequences of para- 
meter-settings in which direct monotonic or nonmonotonic inferences take place. 

If we apply corollary 20 to def.22 and the properties of general beliefs 
on p.68, we get: 

Corollary 24 

If $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models CR^{d}(B(\alpha[a]), B(\beta[a]))$, then necessarily also 
$s(t - \Delta t_c) \models \alpha[a] \wedge \gamma[a] \rightarrow_{dis} \beta[a]$. 

E.g., if the belief that there is a penguin is a direct conclusive reason 
for A’s belief that there is a bird, it follows that A has been disposed to infer 
‘there is a bird’ monotonically from ‘there is a dead penguin’. 

Moreover, by def.22 and our assumption on p.58, we have for all 
$\alpha[a], \beta[a], \gamma[a] \in \mathcal{L}$: 
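Corollary 25 

Assume that for A’s parameter-setting $s(t - \Delta t_c)$ at $t - \Delta t_c$ it holds 
that 

1. for some strict implication sign $\rightarrow$: if $s(t - \Delta t_c) \models B(\alpha[x] \rightarrow \beta[x])$, then 
$s(t - \Delta t_c) \models B(\alpha[x] \wedge \gamma[x] \rightarrow \beta[x])$**, and 

2. $s(t - \Delta t_c) \models B(\alpha[a] \wedge \gamma[a])$. 

Then it follows that: 

if $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{infd} \beta[a]$ then also 

$\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \wedge \gamma[a] \rightarrow_{infd} \beta[a]$. 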

On the other hand, we have the following nonmonotonicity property 
for direct defeasible reasons: 

Corollary 26 

If $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models DR^{d}(AB(\alpha[a]), B(\beta[a]))$, then not necessarily 

$s(t - \Delta t_c) \models \alpha[a] \wedge \gamma[a] \Rightarrow_{dis} \beta[a]$. 

E.g., if the total belief that there is a bird is a direct defeasible reason 
for A’s belief that there is something which can fly, it does not necessarily 
follow that A has also been disposed to infer ‘there is something which can 
fly’ nonmonotonically from ‘there is a penguin bird’ (human agents are again 
typical counterexamples). 

Now we can define what the (i.e. A’s) direct monotonic/nonmonotonic 
inference from $\alpha[a]$ to $\beta[a]$ is - here we use a metalinguistic definite description 
- furthermore what a direct monotonic/nonmonotonic inference is, and finally 
what a direct inference is; in the last two cases we use metalinguistic predicates. 
We do not introduce corresponding expressions in the object language since in 
parts II, III, and IV we will not refer to justified inferences in our formal theory 
of justified inference but only in informal considerations; instead we will ascribe 
justified inferences by means of operators: 

Definition 27 (A’s Direct Monotonic and Nonmonotonic Inferences) 

1. A’s direct monotonic inference from $\alpha[a]$ to $\beta[a]$ is the set of all 
trajectories $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$, s.t.: 

$\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{infd} \beta[a]$ 

**A is “rational” in $s(t - \Delta t_c)$ with respect to $\alpha[x] \rightarrow \beta[x]$ and $\alpha[x] \wedge \gamma[x] \rightarrow \beta[x]$, since 
- according to our def.3 of strict law-like sentences in chapter 2 - whenever the former is 
true, also the latter is true. 
2. A’s direct nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ is the set of all 
trajectories $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$, s.t.: 

$\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{infd} \beta[a]$ 

3. Pr is a direct monotonic inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. 
Pr is A’s direct monotonic inference from $\alpha[a]$ to $\beta[a]$ 

4. Pr is a direct nonmonotonic inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, 
s.t. Pr is A’s direct nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ 

5. Pr is a direct inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. either Pr 
is A’s direct monotonic inference from $\alpha[a]$ to $\beta[a]$, or Pr is A’s direct 
nonmonotonic inference from $\alpha[a]$ to $\beta[a]$. 

Note that direct monotonic inferences are not necessarily disjoint from 
direct nonmonotonic inferences, since within one trajectory of parameter-settings 
more than just one inference might be drawn. 
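
Since def.27 identifies an inference with a set of trajectories, the remark on non-disjointness can be illustrated with ordinary set operations. The following is a toy sketch under assumptions that are not part of the text: a parameter-setting is a set of believed singular sentences, the general beliefs are fixed globally in the hypothetical constants STRICT and DEFEASIBLE, and the hypothetical list TRAJECTORIES stands in for the trajectories of A.

```python
# Hypothetical fixed general beliefs of the toy agent.
STRICT = {("Penguin", "Bird")}
DEFEASIBLE = {("Penguin", "CannotFly")}


def draws_monotonic(traj, pa, pb, a):
    """Toy test for a direct monotonic inference in the given trajectory."""
    first = traj[0]
    return (
        (pa, pb) in STRICT
        and f"{pa}({a})" in first
        and f"{pb}({a})" not in first
    )


def draws_nonmonotonic(traj, pa, pb, a):
    """Toy test for a direct nonmonotonic inference: the premise must be
    the only singular sentence believed at the start of the trajectory."""
    first = traj[0]
    return (
        (pa, pb) in DEFEASIBLE
        and first == frozenset({f"{pa}({a})"})
        and f"{pb}({a})" not in first
    )


s0 = frozenset({"Penguin(a)"})
s1 = frozenset({"Penguin(a)", "Bird(a)", "CannotFly(a)"})
s2 = frozenset({"Penguin(a)", "Dead(a)"})
s3 = frozenset({"Penguin(a)", "Dead(a)", "Bird(a)"})

# Candidate trajectories, each a pair (start setting, end setting).
TRAJECTORIES = [(s0, s1), (s2, s3), (s1, s1)]

# An inference is the set of all trajectories in which it is drawn.
monotonic = frozenset(
    t for t in TRAJECTORIES if draws_monotonic(t, "Penguin", "Bird", "a")
)
nonmonotonic = frozenset(
    t for t in TRAJECTORIES if draws_nonmonotonic(t, "Penguin", "CannotFly", "a")
)

# Trajectories in which both inferences are drawn.
shared = monotonic & nonmonotonic
```

Here the trajectory starting in s0 belongs to both sets, because the total belief in the premise is at the same time a belief in the premise; the trajectory starting in s2 belongs only to the monotonic inference.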

Def.22 suggests the following terminology: we say that the direct 
monotonic inference from $\alpha[a]$ to $\beta[a]$ is based at t on the general belief that 
[$\alpha[x] \rightarrow \beta[x]$ is true] (for some strict implication sign $\rightarrow$) iff 
$\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{infd} \beta[a]$ and $s(t - \Delta t_c) \models B(\alpha[x] \rightarrow \beta[x])$; 
the direct nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ is based at t on the general 
belief that [$\alpha[x] \Rightarrow \beta[x]$ is true] (for some defeasible implication sign $\Rightarrow$) iff 
$\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{infd} \beta[a]$ and $s(t - \Delta t_c) \models B(\alpha[x] \Rightarrow \beta[x])$. 
We use the same terminology for reasons; 
sometimes we say that a general belief is associated with an inference or a 
reason, instead of saying that the inference or the reason is based on the general 
belief. Note that, according to def.23, the direct monotonic/nonmonotonic 
inference from $\alpha[a]$ to $\beta[a]$ might be based on several general beliefs with pairwise 
different contents, but where the contents of the latter beliefs are expressed by 
conditionals which have the antecedent $\alpha[a]$ and the consequent $\beta[a]$ (after 
replacing a by x); in such a case the contents might still differ with respect to the 
implication signs of the conditionals by which they are expressed, and - this is 
the point - with respect to the semantics of these implication signs. It is even 
possible that the direct monotonic/nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ 
is identical to, say, the direct monotonic/nonmonotonic inference from $\gamma[a]$ to 
$\delta[a]$: this is the case if and only if in every sequence of parameter-settings in 
which A draws the one inference, she also draws the other one. 

According to def.22 and 27, direct inferences are mental processes 
which are not necessarily conscious - just as beliefs are mental states which 
are not necessarily conscious states; as we will see below, the same holds for 
indirect inferences. 

^^Compare Schurz[149] on unconscious mental processes and nonmonotonic inference. 
4.5 Direct Reasons and Inferences: Deductive, High Probability, 
Normic 

If the general belief associated with a direct conclusive reason or with a direct 
monotonic inference is such that its content is expressed by a universal general 
sentence, then we call the reason and the corresponding inference ‘deductive’. 
If the general belief associated with a direct defeasible reason or with a direct 
nonmonotonic inference has as its content a proposition that is expressed by 
a high probability conditional, we call the reason a ‘direct high probability 
reason’ and the inference a ‘direct high probability inference’; accordingly, for 
direct reasons or direct inferences based on general beliefs in the truth of normic 
conditionals, which we qualify as ‘direct normic reasons’ and ‘direct normic in- 
ferences’. We use the binary direct-reason- for operators 

and the binary direct inference operators -^infd , ^hpinfd , =^ninfd in order to 
state this more precisely: 

Definition 28 (Direct Deductive/High Probability/Normic Reason-for 
Ascription; Direct Deductive/High Probability/Normic Inference Ascription) 

Let $\langle s(t - \Delta t_c), \ldots, s(t) \rangle$ be a trajectory of parameter-settings for A 
from time $t - \Delta t_c$ to time t: 

1. (s(i-Aic),---,s(i))NC'i?l^(B(a[a]),B(/3[a])) ijf 

(a) s{t — Ate) N B{a[x] p[x]) 

(b) s{t — Ate) 1= B{a[a\) 

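2. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models DR^{d}_{hp}(AB(\alpha[a]), B(\beta[a]))$ iff 

(a) $s(t - \Delta t_c) \models B(\alpha[x] \Rightarrow_{hp} \beta[x])$ 

(b) $s(t - \Delta t_c) \models AB(\alpha[a])$ 

3. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models DR^{d}_{nor}(AB(\alpha[a]), B(\beta[a]))$ iff 

(a) $s(t - \Delta t_c) \models B(\alpha[x] \Rightarrow_{nor} \beta[x])$ 

(b) $s(t - \Delta t_c) \models AB(\alpha[a])$ 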

4- {s{t - Ate), s{t)) 1= a[a] -^infd (3[a] iff 

(a) s{t)¥CRtffB{a[a\),B{l3[a])) 

(b) s{t - Ate) ^ B{(3[a])) 



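5. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{hpinfd} \beta[a]$ iff 

(a) $s(t) \models DR^{d}_{hp}(AB(\alpha[a]), B(\beta[a]))$ 

(b) $s(t - \Delta t_c) \not\models B(\beta[a])$ 

6. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{ninfd} \beta[a]$ iff 

(a) $s(t) \models DR^{d}_{nor}(AB(\alpha[a]), B(\beta[a]))$ 

(b) $s(t - \Delta t_c) \not\models B(\beta[a])$. 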

Thus, direct deductive/high probability/normic inferences are based on
universal/high probability/normic general beliefs. Formal expressions involving
$CR^d_{ded}$, $DR^d_{hp}$, $DR^d_{nor}$, $\Rightarrow_{hpinf^d}$, $\Rightarrow_{ninf^d}$ are read just as their analogues
$CR^d$, $DR^d$, $\rightarrow_{inf^d}$, $\Rightarrow_{inf^d}$, apart from replacing “monotonic” by “deductive”,
and “nonmonotonic” by “high probability” or “normic”.

Analogously to def.27, we can define: 

Definition 29 (A’s Direct Deductive, High Probability, and Normic Inferences)

1. A’s direct deductive inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \rightarrow_{inf^d} \beta[a]$

2. A’s direct high probability inference from $\alpha[a]$ to $\beta[a]$ is the set of all
trajectories $\langle s(t-\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{hpinf^d} \beta[a]$

3. A’s direct normic inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{ninf^d} \beta[a]$

4. Pr is a direct deductive inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t.
Pr is A’s direct deductive inference from $\alpha[a]$ to $\beta[a]$

5. Pr is a direct high probability inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$,
s.t. Pr is A’s direct high probability inference from $\alpha[a]$ to $\beta[a]$

6. Pr is a direct normic inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. Pr
is A’s direct normic inference from $\alpha[a]$ to $\beta[a]$.



A side-remark on direct deductive inferences: according to our definition,
A’s direct deductive inference from, e.g., $P(a)$ to $Q(a)$, is a set of
parameter-setting sequences satisfying some conditions, and it is based on A’s
general belief that [$P[x] \rightarrow Q[x]$ is true]. But according to the usual terminology
in logic, there is no deductive inference from $P(a)$ to $Q(a)$ at all, since
the corresponding argument $P(a) \therefore Q(a)$ is not deductively, i.e., logically, valid
(‘deductive’ is often understood as being synonymous with ‘deductively valid’).
On the other hand, the inference from $P(a)$ and $\forall x(P[x] \rightarrow Q[x])$ to $Q(a)$ is
deductive in the latter sense, since $P(a), \forall x(P[x] \rightarrow Q[x]) \therefore Q(a)$ is indeed
logically valid, and this is the reason why we have chosen this terminology. But
note that as far as the latter argument is concerned, $\forall x(P[x] \rightarrow Q[x])$, or -
if we suppress the universal quantifier - $P[x] \rightarrow Q[x]$, is regarded as a further
premise and thus has the same status as $P(a)$; but, as we have emphasized
above, A’s belief that [$P[x] \rightarrow Q[x]$ is true] has a different role than A’s belief
that [$P(a)$ is true] concerning the cognitive direct deductive inference process
from $P(a)$ to $Q(a)$: while the former belief is a dispositional belief, the latter is
an occurrent belief. In order to keep this distinction between $P[x] \rightarrow Q[x]$ and
$P(a)$ explicit, we speak of A’s direct deductive inference from $P(a)$ to $Q(a)$,
rather than of A’s direct deductive inference from $P(a)$ and $\forall x(P[x] \rightarrow Q[x])$ to
$Q(a)$. In logical terms: A’s general belief that [$P[x] \rightarrow Q[x]$ is true] corresponds
more to a rule of inference than to a further premise formula. A may of course
at the same time also have the occurrent general belief that [$P[x] \rightarrow Q[x]$ is
true], but this occurrent belief - which is a superstate of A’s dispositional general
belief that [$P[x] \rightarrow Q[x]$ is true] - is not necessary for A’s direct deductive
inference from $P(a)$ to $Q(a)$. Similar remarks can be made for high probability
inferences and for normic inferences.

Let us have a look at our cat&bird example again: 

Example 30 (The Cat&Bird Example Reconsidered)

Let our cognitive agent A be a cat. A has at $t-\Delta t_c$ the perceptual belief
that there is a bird right in front of her, and it is A’s intention to hunt and
catch the bird. First of all, A watches the bird for a short while, while the bird
is picking grains from the dirt. In this situation we would suspect that - since
A is lacking any relevant counterevidence - A would infer at $t$ that the bird
is able to fly, and that this inference partly shapes A’s plan of how to hunt the
bird. If A indeed draws such a (direct) inference, then according to corollary 23
we can draw the following conclusions:

1. A has at $t-\Delta t_c$ a certain total perceptual belief concerning $a$, e.g. (say)
the total belief that [$Bird(a)$ is true], where $a$ denotes the bird considered

2. A does not believe at $t-\Delta t_c$ that [$CanFly(a)$ is true]

3. A has at $t-\Delta t_c$ the general, normic, and dispositional belief that
[$Bird[x] \Rightarrow_{nor} CanFly[x]$ is true], i.e. that birds can normally fly

4. thus, A is disposed at $t-\Delta t_c$ to cause/sustain the belief that [$CanFly(a)$
is true] given the total belief that [$Bird(a)$ is true]

5. A’s total belief that [$Bird(a)$ is true] causes at $t$ the belief that [$CanFly(a)$
is true]

6. A’s total belief that [$Bird(a)$ is true] sustains at $t$ the belief that [$CanFly(a)$
is true]

7. A’s total belief that [$Bird(a)$ is true] is a direct defeasible normic reason for
A at $t$ for the belief that [$CanFly(a)$ is true].

It is not entailed that A is also disposed at $t$ to cause/sustain the belief
that [$CanFly(a)$ is true] given the total belief at $t$ that [$Bird(a) \wedge Penguin(a)$
is true].



Here is a preview of part IV, where we will show how such deductive, 
high probability, and normic inferences may be implemented: 

Example 31

1. (In terms of computers employing symbolic computation)

As we will see in part IV, the (direct) deductive inference from $\alpha[a]$ to
$\beta[a]$ may be implemented by symbolic computation in the following way:

at first it is verified whether the conditional knowledge base, which is a
data base for general sentences, contains $\alpha[x] \rightarrow \beta[x]$, or whether it is
derivable from the conditional knowledge base by the application of rules
of inference. If this is indeed the case, it is checked whether the factual
knowledge base, which is a data base for singular sentences, contains $\alpha[a]$.
If this is also the case, $\beta[a]$ is added to the factual knowledge base.

Moreover, the (direct) high probability/normic inference from $\alpha[a]$ to $\beta[a]$
may be implemented by symbolic computation as follows:

at first it is verified whether the conditional knowledge base contains
$\alpha[x] \Rightarrow_{hp} \beta[x]$ / $\alpha[x] \Rightarrow_{nor} \beta[x]$, or whether they are derivable from the
conditional knowledge base by the application of rules of inference. If
this is indeed the case, it is checked whether the conjunction of
all singular formulas contained in the factual knowledge base is identical
to $\alpha[a]$. If this is also the case, $\beta[a]$ is added to the factual knowledge
base. Alternatively, and more efficiently, one tries to extend the conditionals
in the conditional knowledge base to a conditional whose antecedent is
logically equivalent to the conjunction of all singular formulas contained
in the factual knowledge base.

2. (In terms of connectionist networks employing state-transitions)

As we will see in part IV, the (direct) deductive inference from $\alpha[a]$ to $\beta[a]$
may be implemented by state-transitions in an artificial neural network
in the following way:

the activation patterns that represent beliefs are set in such a way that
whenever a pattern is activated which contains the pattern corresponding
to $\alpha[a]$, also the pattern corresponding to $\beta[a]$ is active.

Moreover, the (direct) high probability/normic inference from $\alpha[a]$ to $\beta[a]$
may be implemented by state-transitions in an artificial neural network
as follows:

the weights of the network connections are set in such a way that whenever
precisely the pattern is activated which corresponds to $\alpha[a]$ (but where no
further node is activated), the network activity converges to a stable state
in which the pattern corresponding to $\beta[a]$ is active.
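
The symbolic procedure described in the first part of example 31 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the implementation discussed in part IV: the names Conditional, Agent and direct_inferences are hypothetical, conditionals are merely looked up in the conditional knowledge base (derivation by rules of inference is omitted), and formulas are restricted to conjunctions of atomic predicates applied to one constant.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Conditional:
    """A general sentence alpha[x] -> beta[x] (hypothetical representation)."""
    antecedent: frozenset  # predicate names whose conjunction forms alpha[x]
    consequent: str        # predicate name forming beta[x]
    kind: str              # "strict", "hp" (high probability) or "nor" (normic)


@dataclass
class Agent:
    conditionals: list = field(default_factory=list)  # conditional knowledge base
    facts: set = field(default_factory=set)           # factual knowledge base

    def direct_inferences(self, constant):
        """Return the singular sentences inferred about `constant` in one step."""
        inferred = set()
        for cond in self.conditionals:
            premise = {(pred, constant) for pred in cond.antecedent}
            if cond.kind == "strict":
                # Deductive case: alpha[a] only has to be contained in the facts.
                applies = premise <= self.facts
            else:
                # Defeasible case: alpha[a] has to be all that is believed.
                applies = premise == self.facts
            conclusion = (cond.consequent, constant)
            if applies and conclusion not in self.facts:
                inferred.add(conclusion)
        return inferred


cat = Agent(
    conditionals=[Conditional(frozenset({"Bird"}), "CanFly", "nor")],
    facts={("Bird", "a")},
)
print(cat.direct_inferences("a"))  # {('CanFly', 'a')}

cat.facts.add(("Penguin", "a"))    # additional evidence blocks the inference
print(cat.direct_inferences("a"))  # set()
```

The equality test in the defeasible branch is what makes the sketch nonmonotonic: the same normic conditional no longer licenses the conclusion once the total belief is Bird(a) together with Penguin(a), which mirrors the closing remark of example 30.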

4.6 Reasons and Inferences 

Now we extend our account of direct inference to inference in general, i.e., to 
direct or indirect inference. In the case of an ideal agent whose set of general
beliefs is closed under all kinds of logical rules, the distinction of reasons
and direct reasons and also of inferences and direct inferences would collapse. 
Moreover, the same collapse would occur if we simply identified dispositional 
general beliefs and dispositions to change from a premise belief to a conclusion 
belief. But we do not restrict ourselves to ideal agents only, and as we have 
pointed out in sections 3.4 and 4.3, dispositional general beliefs indeed differ 
from dispositions to change from a premise belief to a conclusion belief. 

We distinguish again two kinds of reason-for relations:
let $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ be a trajectory of parameter-settings ($n\Delta t_c$
is $n$ times $\Delta t_c$, where $\Delta t_c$ is again the fixed amount of time that we have
presupposed both causation and causal sustaining to take or endure); we use
the following notation:

• (Notation for Reason-For Ascription)

1. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models CR(B(\alpha), B(\beta))$ iff the belief that [$\alpha$ is
true] is a conclusive reason for A in $s(t)$ rel. to $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$
for the belief that [$\beta$ is true]; or, synonymously: that A believes that
[$\alpha$ is true], is a conclusive reason in $s(t)$ rel. to $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$
for A’s believing that [$\beta$ is true]

(where $CR$ is a binary operator, which we add to our vocabulary;
$CR$ is to be applied to formulas of $\mathcal{L}_b$)

2. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models DR(AB(\alpha), B(\beta))$ iff the total belief that
[$\alpha$ is true] is a defeasible reason for A in $s(t)$ rel. to $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$
for the belief that [$\beta$ is true]; or, synonymously: that all that
A believes is that [$\alpha$ is true], is a defeasible reason in $s(t)$ rel.
to $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ for A’s believing that [$\beta$ is true]

(where $DR$ is a binary operator, which we also add to our vocabulary;
$DR$ is to be applied to formulas of $\mathcal{L}_{b,ab}$).

Furthermore, we say:

• (Notation for Monotonic Inference Ascription)

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha \rightarrow_{inf} \beta$ if and only if A draws in
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ a monotonic inference from the belief that [$\alpha$ is true]
to the belief that [$\beta$ is true].

Or, synonymously: A draws in $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ a monotonic inference
from the proposition that [$\alpha$ is true] to the proposition that [$\beta$ is true];
or: A draws in $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ a monotonic inference from $\alpha$ to $\beta$.

• (Notation for Nonmonotonic Inference Ascription)

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha \Rightarrow_{inf} \beta$ if and only if A draws in
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ a nonmonotonic inference from the total belief that
[$\alpha$ is true] to the belief that [$\beta$ is true].

Again, synonymously: A draws in $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ a nonmonotonic
inference from the proposition that [$\alpha$ is true] to the proposition that [$\beta$
is true]; or: A draws in $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ a nonmonotonic inference from
$\alpha$ to $\beta$.



So we can define what it means to say that A draws a monotonic/nonmonotonic
inference, by first defining the conclusive/defeasible reason-for ascription;
both kinds of notions are given by the n-fold iteration of their direct
counterparts:

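Definition 32 (Reason-for Ascription, Inference Ascription)

Let $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ be a trajectory of parameter-settings for A:

1. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models CR(B(\alpha[a]), B(\beta[a]))$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models CR^d(B(\gamma_0[a]), B(\gamma_1[a]))$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models CR^d(B(\gamma_1[a]), B(\gamma_2[a]))$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models CR^d(B(\gamma_{n-1}[a]), B(\gamma_n[a]))$

2. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models DR(AB(\alpha[a]), B(\beta[a]))$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models DR^d(AB(\gamma_0[a]), B(\gamma_1[a]))$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models DR^d(AB(\gamma_1[a]), B(\gamma_2[a]))$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models DR^d(AB(\gamma_{n-1}[a]), B(\gamma_n[a]))$

3. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models \gamma_0[a] \rightarrow_{inf^d} \gamma_1[a]$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models \gamma_1[a] \rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \gamma_{n-1}[a] \rightarrow_{inf^d} \gamma_n[a]$

4. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{inf} \beta[a]$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models \gamma_0[a] \Rightarrow_{inf^d} \gamma_1[a]$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models \gamma_1[a] \Rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \gamma_{n-1}[a] \Rightarrow_{inf^d} \gamma_n[a]$.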

Note that in the case of nonmonotonic inferences, each of their direct 
component inferences, except for the last one, is actually an inference from 
a total belief state to a total belief state, since each conclusion belief state - 
except for the last one - is the premise belief state of the subsequent direct 
inference. 

Monotonic/nonmonotonic inferences may be based on several general 
beliefs the contents of which are expressed by different strict /defeasible condi- 
tionals. 



Now let us define what the (i.e. A’s) monotonic/nonmonotonic inference
from $\alpha[a]$ to $\beta[a]$ is, what a monotonic/nonmonotonic inference is, and what
an inference is:

Definition 33 (A’s Monotonic and Nonmonotonic Inferences)

1. A’s monotonic inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$

2. A’s nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{inf} \beta[a]$

3. Pr is a monotonic inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. Pr is
A’s monotonic inference from $\alpha[a]$ to $\beta[a]$

4. Pr is a nonmonotonic inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t.
Pr is A’s nonmonotonic inference from $\alpha[a]$ to $\beta[a]$

5. Pr is an inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. either Pr is
A’s monotonic inference from $\alpha[a]$ to $\beta[a]$, or Pr is A’s nonmonotonic
inference from $\alpha[a]$ to $\beta[a]$.
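
The n-fold iteration that underlies definitions 32 and 33 can be made concrete by a small Python sketch. It rests on loudly simplified assumptions that are not part of our account: a parameter-setting is represented merely by the set of sentences believed in it, the agent's general beliefs are given as already instantiated pairs (premise, conclusion), and the hypothetical function direct_step is a crude stand-in for direct monotonic inference that ignores causation and sustaining.

```python
def direct_step(rules, before, after, premise, conclusion):
    """Crude stand-in for a direct monotonic inference across one time step."""
    return (
        (premise, conclusion) in rules  # general belief backing the step
        and premise in before           # premise believed at the earlier time
        and conclusion not in before    # conclusion not yet believed
        and conclusion in after         # conclusion believed one step later
    )


def draws_inference(rules, trajectory, alpha, beta):
    """Check for a chain alpha = g_0, g_1, ..., g_n = beta such that every
    adjacent pair of states in `trajectory` is linked by a direct step
    from g_i to g_(i+1); n is the number of transitions in the trajectory."""
    frontier = {alpha}  # candidates for g_i after i transitions
    for before, after in zip(trajectory, trajectory[1:]):
        frontier = {
            conclusion
            for premise in frontier
            for (rule_premise, conclusion) in rules
            if rule_premise == premise
            and direct_step(rules, before, after, premise, conclusion)
        }
    return beta in frontier


rules = {("P(a)", "Q(a)"), ("Q(a)", "R(a)")}
trajectory = [{"P(a)"}, {"P(a)", "Q(a)"}, {"P(a)", "Q(a)", "R(a)"}]
print(draws_inference(rules, trajectory, "P(a)", "R(a)"))  # True
```

Since the chain has to use exactly one direct step per transition, the same two-transition trajectory does not count as an inference from P(a) to Q(a), whereas its initial one-transition segment does.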

4.7 Reasons and Inferences: Deductive, High Probability, Normic 

We use the binary reason-for operators $CR_{ded}$, $DR_{hp}$, $DR_{nor}$, and the binary
inference operators $\rightarrow_{inf}$, $\Rightarrow_{hpinf}$, $\Rightarrow_{ninf}$ to define general deductive/high
probability/normic reason-for/inference ascription:

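Definition 34 (Deductive/High Probability/Normic Reason-for Ascription;
Deductive/High Probability/Normic Inference Ascription)

Let $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$ be a trajectory of parameter-settings for A:

1. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models CR_{ded}(B(\alpha[a]), B(\beta[a]))$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models CR^d_{ded}(B(\gamma_0[a]), B(\gamma_1[a]))$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models CR^d_{ded}(B(\gamma_1[a]), B(\gamma_2[a]))$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models CR^d_{ded}(B(\gamma_{n-1}[a]), B(\gamma_n[a]))$

2. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models DR_{hp}(AB(\alpha[a]), B(\beta[a]))$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models DR^d_{hp}(AB(\gamma_0[a]), B(\gamma_1[a]))$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models DR^d_{hp}(AB(\gamma_1[a]), B(\gamma_2[a]))$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models DR^d_{hp}(AB(\gamma_{n-1}[a]), B(\gamma_n[a]))$

3. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models DR_{nor}(AB(\alpha[a]), B(\beta[a]))$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models DR^d_{nor}(AB(\gamma_0[a]), B(\gamma_1[a]))$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models DR^d_{nor}(AB(\gamma_1[a]), B(\gamma_2[a]))$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models DR^d_{nor}(AB(\gamma_{n-1}[a]), B(\gamma_n[a]))$

4. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models \gamma_0[a] \rightarrow_{inf^d} \gamma_1[a]$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models \gamma_1[a] \rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \gamma_{n-1}[a] \rightarrow_{inf^d} \gamma_n[a]$

5. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{hpinf} \beta[a]$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models \gamma_0[a] \Rightarrow_{hpinf^d} \gamma_1[a]$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models \gamma_1[a] \Rightarrow_{hpinf^d} \gamma_2[a]$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \gamma_{n-1}[a] \Rightarrow_{hpinf^d} \gamma_n[a]$

6. $\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{ninf} \beta[a]$ iff
there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

(a) $\langle s(t-n\Delta t_c),\ldots,s(t-(n-1)\Delta t_c)\rangle \models \gamma_0[a] \Rightarrow_{ninf^d} \gamma_1[a]$

(b) $\langle s(t-(n-1)\Delta t_c),\ldots,s(t-(n-2)\Delta t_c)\rangle \models \gamma_1[a] \Rightarrow_{ninf^d} \gamma_2[a]$

$\vdots$

(c) $\langle s(t-\Delta t_c),\ldots,s(t)\rangle \models \gamma_{n-1}[a] \Rightarrow_{ninf^d} \gamma_n[a]$.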

Finally, we have: 

Definition 35 (A’s Deductive, High Probability, and Normic Inferences)

1. A’s deductive inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$

2. A’s high probability inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{hpinf} \beta[a]$

3. A’s normic inference from $\alpha[a]$ to $\beta[a]$ is the set of all trajectories
$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle$, s.t.:

$\langle s(t-n\Delta t_c),\ldots,s(t)\rangle \models \alpha[a] \Rightarrow_{ninf} \beta[a]$

4. Pr is a deductive inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. Pr is
A’s deductive inference from $\alpha[a]$ to $\beta[a]$

5. Pr is a high probability inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t.
Pr is A’s high probability inference from $\alpha[a]$ to $\beta[a]$

6. Pr is a normic inference (of A) iff there are $\alpha[a], \beta[a] \in \mathcal{L}$, s.t. Pr is
A’s normic inference from $\alpha[a]$ to $\beta[a]$.
Chapter 5
GENERAL REMARKS ON JUSTIFICATION AND JUSTIFIED BELIEF



In the last part we have given an explication of the concept of inference in
general, and of monotonic and nonmonotonic inference in particular. In this part
we will define what it means to say that an inference (monotonic or
nonmonotonic) is justified, where the premise beliefs of the inferences that we
think of are either perceptual beliefs or occurrent central state beliefs. This
first chapter of part II is devoted to some introductory remarks concerning the
topic of justification in general, and the topic of justified belief in particular.
Chapter 6 presents our theory of justified inference informally; chapter 7
discusses the notion of reliability that is presupposed by our theory. Chapters 6
and 7 together motivate the formal details of chapter 8, where our theory of
justified inference is finally stated.

5.1 Intuitions 

Justification is defined and studied primarily as a property of beliefs and not 
of inferences. Intuitively, an agent’s belief is justified if and only if it is held 
by the agent for good reasons. Theories of justification aim at circumscribing 
the latter phrase by necessary and sufficient conditions, and the number of 
suggestions for such conditions is legion. However, there are two main families
of such theories*, and we are now going to deal with the characteristics of each
of them.

5.1.1 Internalist Justification of Beliefs 

The first family of theories consists of the so-called internalist theories of justi- 
fication. A characteristic feature of such theories is that they analyze the notion 
of justification by analysis of the phrase ‘held by the agent for good reasons’ 
in terms of three components: there is something (i) that is a reason, (ii) the
justified belief is held for it, and (iii) it is good. Let us discuss these parts more 
thoroughly: 

* Actually, one might also think of a third “main” family of such theories where elements 
of the two other families are combined; see Brendel[24], section 7.3, on such “mixed theories”. 
Ad (i): reasons are either defined to be mental states, or the contents
of mental states, i.e., propositions. Let us restrict ourselves to the first case only: thus,
belief states are the primary candidates for reasons, since they subserve infer- 
ences (as we know from part I); perceptual states - if not regarded as belief 
states themselves anyway - are additional typical examples of reasons. Reasons 
are supposed to be cognitively accessible (to the agent or to a subsystem of the 
agent whose belief is justified), s.t. reasons are mental entities that the agent 
may reflect on, where the agent judges them to be reasons, and where the agent 
may bring forward the reasons in favour of the justified belief by internal or 
external argumentation; in such a case we speak of an internalist theory of jus- 
tification. If the agent is furthermore supposed to be consciously aware of the 
justifying reasons and also of their being good reasons for the justified belief, 
or if ‘cognitively accessible’ is taken as synonymous to ‘consciously accessible’ 
right from the start, we may speak of a classical internalist theory of justi- 
fication: “Internalists tend to emphasize our conscious internal access to the 
relations between our beliefs. On this understanding of internalism, reflective, 
careful agents are able to make epistemological assessments of their beliefs.” 
(Pollock[125], p.26). Chisholm[34], p.5, states this as follows: “In making their 
assumptions, epistemologists presuppose that they are rational beings. This 
means, in part, that they have certain properties which are such that, if they 
ask themselves, with respect to any of these properties, whether or not they 
have that property, then it will be evident to them that they have it. It means 
further that they are able to know what they think and believe and that they 
can recognize inconsistencies.” As Chisholm points out, internalist theories of 
justification make justification come very close or even identical to (theoreti- 
cal) rationality. In this way, they can explain why we are inclined to call an 
agent’s belief justified if the agent is able to argue for it by presenting the rea- 
sons for which she holds the belief. But there are also non-classical internalist 
theories of justification: e.g. Pollock has developed an account of justification 
and rationality in terms of procedural epistemic norms which he calls ‘direct 
realism’ and which he has even complemented by computer implementations 
of what he calls “rational agents” (for an overview see Pollock[125], chapters 5-7;
for a detailed and most recent account see Pollock[126]). Internalism might
furthermore also be understood as a ‘third-person internalism’, i.e., where the 
reasons are not necessarily brought forward by the agent herself but rather 
by the scientific community; but we will restrict ourselves to the discussion of 
internalism just in the traditional first-person sense. 

Ad (ii): it would not be sufficient for justifiedness that there merely 
were some good reasons for the agent’s having a belief (but where the agent 
does not “have” them), and it would also be insufficient if the agent just had 
good reasons for having the belief (but where the reasons do not have any kind 
of causal influence on the belief to be justified); rather, the good reasons have to 
cause or sustain (recall our explications in part I) the agent’s belief in order to 
justify it. “Being justified in holding a belief on a certain basis consists of your
belief “arising out of” that basis in some appropriate way”, where the “basing
relation is at least partly a causal relation” (Pollock[125], p.79). Thus, the good
reasons for a justified belief are reasons for the belief both in a causal and in 
a normative sense: they cause or sustain the belief in question, and, by their 
being good, they also justify it. As far as the causal aspect is concerned, the 
“causation and sustaining” of a belief by a reason may generally be achieved in 
more than just one way, i.e., there is generally more than just one causing or 
sustaining process (type) that might lead from the reasons to what the reasons 
are reason for. 

Ad (iii): a reason is good if it is an “adequate indication” (Moser et
al.[112], p.77) of the truth of the justified belief. It is usually considered to
be “good” to have a true belief, and, correspondingly, a reason for a belief is 
“good” if and only if it indicates to the agent or to a subsystem of the agent that 
the belief is true. If a reason justifies a belief, one of the following conditions 
holds: if the reason is not itself a belief, then the process that leads from the reason
to the belief, and that causes or sustains the belief, is justification-producing,
i.e., its outcome is the reason’s indication of the truth of the justified belief to
the agent. If the reason is itself a belief, then it has to be justified itself, and
the process that leads from the reason to the belief, and that causes or sustains
the belief, is justification-transmitting (this is Audi’s term; cf. [12], chapter 6),
i.e., its outcome is the transmission of the truth-indicating character of the 
reason to the justified belief. The latter clause concerning reasons being beliefs 
themselves, points to an ambiguity that affects the notion of a good reason 
if the reason is itself a belief: first of all, the reason has to be good in the 
sense of being justified itself; in this case, ‘good’ is a unary predicate. But, 
secondly, a belief that is a good reason is also good for the belief that it is 
a reason for, i.e., there is a justification-transmitting process that leads from 
the reason to the justified belief - here ‘good for’ is a binary predicate that 
might or might not be analyzed as being good simpliciter and being a cause of 
the justified belief, depending on what theory of justification is involved. If a 
reason is not itself a belief, then perhaps only the binary notion of ‘goodness’ 
is used. We will focus on justification-transmission when we deal later with the
justification of inferences. An explication of the justification-“production” by
a reason is beyond the aims of our exposition.

The axiological qualification of reasons as being good is often meant 
to entail the deontological qualification of justified beliefs as being permitted. 
If we hold a belief for a good reason, the belief is “epistemically permissible” , 
where ‘permissibility’ does, of course, not have a moral connotation here but 
just an epistemological connotation (the relation between epistemic and moral 
appraisal has been focused on, e.g., by Chisholm: see Haack[76] for an overview 
of Chisholm’s contributions on that topic; see Pollock[125], p.11, for more on
epistemic permissibility). It implies that no cognitive agent is to be “blamed”
or “accused” for holding a justified belief, at least not from an epistemological
point of view (but blamelessness does not by itself entail permissibility - see 
Moser [111], p.40). Equivalently: it is not epistemically obligatory to abandon a 
justified belief. On the other hand, an agent is normally not obliged to have a 
certain justified belief, since the agent might refrain from holding the belief for 
purely non-epistemological reasons, which is certainly permissible; e.g., the be- 
lief might be simply irrelevant with respect to what the agent desires to do. The 
epistemic permissibility of a justified belief may also be characterized in the way 
that the justification of a belief contributes to its being knowledge rather than 
mere belief. If justification is correspondingly understood as that “what turns 
true belief into knowledge”, then every theory of knowledge has to include a
theory of justified belief as a (proper) part†. Therefore theories of justification
are often considered to be primary to theories of knowledge, since the latter
are often assumed to presuppose the former.‡ According to Goldman[69], and
also to Pollock[125], “the central question of epistemology concerns the
justification of belief rather than knowledge” (Pollock[125], p.12). The justification
of belief is interesting in itself, independently of whether a successful account 
of justified belief contributes to our understanding of knowledge or not. Thus 
we are mainly going to neglect the notorious questions concerning knowledge 
but rather concentrate on justification. 

5.1.2 Externalist Justification of Beliefs 

The second family of theories of justification consists of the so-called externalist 
theories. A typical characteristic of such theories is that the phrase ‘held by the 
agent for good reasons’ is taken holophrastically, i.e., not divided up into parts 
at all; ‘held by the agent for good reasons’ is thus not regarded as a complex 
predicate but rather as a primitive predicate of the form ‘held-by-the-agent- 
for-good-reasons’. In particular, if a belief is justified, this does not necessarily 
entail that there was an entity that would be a reason for the justified belief, 
s.t. this entity would be “good”. An agent’s belief that [$\varphi$ is true] is rather
held-for-good-reasons by the agent if it is part of a belief system which stands
in some sort of correspondence to the facts, i.e., which is “truth-linked”. There
is not (necessarily) something that indicates the truth of $\varphi$ to the agent, but
the fact that [$\varphi$ is true] plays itself a role in the causation or sustaining of the
agent’s justified belief that [$\varphi$ is true]. Theories of justification that approach
justifiedness in this way are usually called ‘externalist’. The main differences be- 
tween internalist and externalist theories are: the former assume the existence 
of reasons in a both causal and normative sense, s.t. reasons are cognitively 
accessible to the agent. Externalist theories usually do not use the notion of 
a reason at all - at best, they might regard facts or processes as reasons, or 

† According to the classical account of knowledge, knowledge simply is justified true belief.
However, this account has been under attack, at least in its more simple-minded versions, since
Gettier’s seminal paper [62].

‡ But that is of course not always the case: cf. Williamson[176] for a recent counterexample.
whatever has some causal role relative to the belief to be justified - and, as
Brendel[24], pp. 188-189, points out, the agent is not supposed to be familiar
with or even consciously aware of the justifying relation between the fact that
[$\varphi$ is true] and the agent’s belief that [$\varphi$ is true]§. A prime example of such an
externalist theory of justification is “process reliabilism” : an agent’s belief is 
justified if and only if it is produced (see def.204 in chapter 21), i.e., caused or 
sustained, by a reliable process. In the more evolved reliabilist theories there 
are additional properties that are claimed to be necessary of justified beliefs; 
e.g., Goldman[67], p.123, adds a kind of defeasibility condition: “If S’s belief
in p at t results from a reliable cognitive process, and there is no reliable or
conditionally reliable process available to S which, had it been used by S in
addition to the process actually used, would have resulted in S’s not believing
p at t, then S’s belief in p at t is justified.” But let us concentrate just on the
original idea of reliabilism. A process that leads from a non-belief to a belief is 
justification-producing not because the non-belief is a good reason for the belief 
in the internalist sense, but because the process itself has the property of being 
reliable; accordingly, a process that leads from a belief to a belief, and that is
conditionally reliable (this is Goldman’s [67] term), i.e., reliable given the
justification of the former belief, is justification-transmitting. A process of the first
kind is reliable if and only if it tends to produce, i.e., to cause or sustain, more 
true beliefs than false beliefs. A process of the second kind is conditionally reli- 
able if and only if it tends to produce, i.e., to cause or sustain, more true beliefs 
than false beliefs given that its “input-beliefs” are true (this notion of reliability 
will be dealt with more precisely in chapter 7). Thus, a justification-producing 
process is nothing but a truth-producing process; a justification-transmitting 
process is nothing but a truth-transmitting (or, as it is more often called, a 
“truth-preserving” ) process. Neither ‘truth-producing’ nor ‘truth-transmitting’ 
is meant to be exceptionless, i.e., a truth-producing process does not necessarily 
always produce true beliefs, just as a truth-transmitting process does not nec- 
essarily always transmit truth; such processes are only demanded to be reliable 
to a certain degree. ‘Truth-producing’ and ‘truth-transmitting’ are sometimes
subsumed under the term ‘truth-conducive’, which we regard as being synony- 
mous to ‘reliable’. 
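
The two notions of reliability just introduced can be illustrated by a minimal Python sketch. It assumes, purely for illustration, that a process is given by a finite record of its runs and that "more true beliefs than false beliefs" may be read as a rate above one half; the names `Output`, `reliability`, `conditional_reliability`, and `tends_to_truth` are hypothetical and belong to none of the theories discussed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Output:
    """One recorded run of a belief-producing process (hypothetical record type)."""
    belief_true: bool                  # was the produced belief true?
    input_true: Optional[bool] = None  # truth of the input belief; None for a non-belief input


def reliability(runs: Sequence[Output]) -> float:
    """Fraction of produced beliefs that are true (unconditional reliability)."""
    if not runs:
        raise ValueError("no recorded runs")
    return sum(r.belief_true for r in runs) / len(runs)


def conditional_reliability(runs: Sequence[Output]) -> float:
    """Fraction of true output beliefs among the runs whose input belief was true."""
    relevant = [r for r in runs if r.input_true is True]
    if not relevant:
        raise ValueError("no runs with a true input belief")
    return sum(r.belief_true for r in relevant) / len(relevant)


def tends_to_truth(rate: float) -> bool:
    """Assumed reading of 'more true beliefs than false beliefs': a rate above 0.5."""
    return rate > 0.5


# A process of the first kind: it starts from non-beliefs (e.g., perception).
perception = [Output(True), Output(True), Output(False), Output(True)]

# A process of the second kind: it leads from input beliefs to output beliefs.
inference = [
    Output(True, input_true=True),
    Output(True, input_true=True),
    Output(False, input_true=True),
    Output(False, input_true=False),
    Output(False, input_true=False),
]
```

On this toy record, `inference` counts as conditionally reliable although its unconditional rate of true outputs is below one half, since its failures stem mostly from false input beliefs. This is the sense in which such a process transmits truth without producing it.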

The best known advocate of such a theory is Alvin Goldman (e.g., 

§ Goldman [69], p.24, formulates the externalism/internalism distinction in terms of the 
conditions of justifiedness: “Theories that invoke solely psychological conditions of the cog- 
nizer are naturally called ‘subjective’, or ‘internalist’, theories. Theories that invoke such 
matters as the actual truth or falsity of relevant propositions are naturally called ‘external- 
ist’ theories (assuming, at any rate, some realist approach to truth).” Pollock [125], p.24, 
and p.26, uses a similar terminology, except for enlarging the class of externalist theories:
“Internalism maintains that the justifiability of a belief should be a function of our internal
states.”; “Externalism is the denial of internalism. According to externalism, more than just
the internal states of the believer enter into the justification of beliefs.” Neither author
considers conscious accessibility to reasons a defining property of internalist theories, as
is the case for what we have called the classical internalist theories.
in [67]): his theory, or better, his theories of justification are summarized and
criticized in the second chapter of the appendix. Goldman’s theory is in many 
respects a paragon of the theory of justified inference that we are going to 
develop, although it has been under attack (mainly from the internalist side 
of course) for various reasons. One of its most urgent problems is called the 
‘generality problem’ (see our summary in section 22.5.2 in the appendix)¶:
usually, one and the same belief state of an agent is not caused or sustained 
at a time t by just one process but by more than one process; these processes 
might be very specific ones (in our theory: “small” sets of trajectories), or 
they might be very general ones (in our theory: “large” sets of trajectories), 
or they might be somewhere in between. In the extreme case, the agent’s total 
cognitive activity might be referred to as “the” process that produces the belief 
in question. Is it sufficient for justification that at least one of those processes is 
reliable, even if this reliable process is individuated counter-intuitively, being a 
more or less theoretical entity that no psychologist would ever count as a proper 
mental process? Or, maybe, is the existence of at least one reliable process that 
causes or sustains a belief simply not sufficient to ensure justification? E.g.: 
let us assume that some belief of A is caused by a reliable process Pr1, since
A is disposed to initiate Pr1 under certain circumstances, and indeed these
circumstances happened to be the case. Prima facie, the so-produced belief is
justified (according to a reliabilist theory). But now let us furthermore assume
that the agent’s disposition to initiate Pr1 has itself been caused by another
process Pr2 that is indeed highly unreliable. Should we still count the belief
as justified or rather as unjustified? In the first case we would argue that A's
belief has been produced by a reliable process, i.e., Pr1; in the second case we
would reply that A's belief has been jointly produced by an unreliable and a
reliable process, i.e., by the composition Pr1 ∘ Pr2 of Pr1 and Pr2 - but if Pr2
is sufficiently unreliable, then this composition will be an unreliable process. It
might be regarded as a response to this problem that Goldman[69] introduces 
the distinction of basic and of acquired processes, of first- and second-order 
processes, of first- and second-order reliability, and of primary, secondary, and 
full justifiedness (see section 22.2 in the appendix; we use Goldman’s original 
terminology): a (belief-producing) process is basic if it is part of the fixed, 
native architecture of the cognitive system - later, we will understand the term 
slightly differently, s.t. a process is basic at the time t relative to the cognitive
“history” of the agent if it is part of the cognitive architecture of
the agent at every point of time t′ before t, where t′ is contained in the index
set In - a basic process in this latter sense has always been part of the agent’s
cognitive architecture before t, but it might cease to be so after t, whereas a 
basic process in the former sense is part of the agent’s cognitive architecture in 



¶ The other problems are (i) the problem of defining reliability (see section 22.5.3 in the
appendix, and chapter 7 below), and (ii) the problem of subjective justification (see section 
22.5.4 in the appendix). 
all possible trajectories; but these differences are not so relevant now. A process
is acquired if it is not basic but itself produced by another process. E.g.: 

Example 36 

1. (In terms of computers) 

Consider a logical circuit which is a fixed part of the hardware of a com- 
puter system: the system’s state of having such a circuit, i.e., the set of all 
parameter-settings, is a dispositional state, and - let us assume this - it
might even be a general belief state of the computer system; an inference
based on the latter belief would thus be a basic process. On the other hand,
if the circuit was alterable, i.e., “reprogrammable” in some way, and if it
had indeed been altered, any inference process caused by the circuit would 
have been acquired. 

2. (In terms of connectionist networks) 

If the weights which are associated with the edges of a connectionist net- 
work are fixed, and if the very dispositional state of the network, in which 
the weights are set in precisely this way, i.e., the set of all parameter- 
settings, is a general belief state, then an inference process based on this 
belief is basic. On the other hand, if the weights were alterable by some 
learning procedure, and if they had indeed been altered, the inference would 
have been acquired. 

Goldman calls a process-producing process “second-order” if it - di- 
rectly or indirectly - produces first-order processes, where a first-order process 
is a process that produces beliefs.‖ E.g., inferences as we have defined them in
part I are typical first-order processes. Learning processes by which methods 
or algorithms are learnt are typical examples of second-order processes. For a 
further example reconsider example 36 above: 

Example 37 

1. (In terms of computers) 

In example 36 above, the computer inferences are first-order processes, 
whereas the reprogramming of the computer’s circuit would be both a
first- and a second-order process (first-order because a general belief
is also formed).

2. (In terms of connectionist networks) 

In example 36 above, the inferences by the network are also first-order
processes, whereas the network’s learning process would be both a first- and a
second-order process (first-order because a general belief is also formed).

‖ Note that Goldman’s term ‘first-order’ has nothing to do with first-order logic, and the
same holds for ‘second-order’ and second-order logic. 
In both cases, the second-order processes lead to dispositional states.

If a belief is caused or sustained by a basic process that is also first- 
order reliable, i.e., reliable in the sense sketched above, then the belief is justified 
according to Goldman’s theory; however, if the belief is caused or sustained by 
an acquired process, say, Pr1, that is in turn caused or sustained by a second-
order process Pr2, s.t. Pr2 is not second-order reliable, i.e., where Pr2 does not
tend to produce reliable processes, then the belief is not justified. At best, the
belief may be said to have “primary justifiedness”, if Pr1 is (first-order) reliable;
but it lacks “secondary justifiedness”, since Pr2 is not (second-order) reliable.
Thus, it also lacks “full justifiedness” which is the conjunction of primary and
secondary justifiedness and which concerns both Pr1 and Pr2. On the other
hand, if Pr2 were indeed second-order reliable, then the belief would be “fully”
justified. This solves one aspect of the generality problem, but other aspects
remain problematic: (i) if Pr2 is e.g. itself acquired by another process Pr3,
then Pr3 should also be required to be (third-order) reliable for the justifiedness of
the belief concerned, and this may be iterated further;** (ii) if Pr2 is basic, then
the agent’s disposition to initiate Pr2 might e.g. be a substate of a dispositional 
belief of A that is innate; but is every innate belief of A automatically justified 
just because it has not been produced by any kind of process? Or is it only 
justified if it has been produced by some reliable pre-cognitive “growth” process 
of the agent, and if so, what is the difference then between a basic process and 
an acquired process? (iii) Besides Pr1 and Pr2, the “justified” belief may have
additionally been caused or sustained by a basic process Pr1′ that is not first-
order reliable; and even if Pr1′ was reliable, it might be the case that Pr1′ was
itself caused or sustained by a not second-order reliable process Pr2′: how do
we decide on justifiedness in such a case? We will return to these points when 
we discuss the justification of inferences. 
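
The distinctions between primary, secondary, and full justifiedness, together with the regress noted under (i), may be made concrete by the following Python sketch. It is an illustration under simplifying assumptions and not Goldman’s own formulation: reliability is treated as a yes/no property, each process has at most one acquiring process, and secondary justifiedness is counted as vacuously satisfied for basic processes. The names `Process`, `primary_justified`, `secondary_justified`, `fully_justified`, and `chain_reliable` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Process:
    """A belief- or process-producing process (hypothetical representation)."""
    name: str
    reliable: bool                            # reliable at its own order
    acquired_by: Optional["Process"] = None   # None means the process is basic

    @property
    def basic(self) -> bool:
        return self.acquired_by is None


def primary_justified(pr1: Process) -> bool:
    """Primary justifiedness: the belief-producing process is first-order reliable."""
    return pr1.reliable


def secondary_justified(pr1: Process) -> bool:
    """Secondary justifiedness: the acquiring process, if there is one, is reliable.

    Assumption: for a basic process the condition holds vacuously.
    """
    return pr1.basic or pr1.acquired_by.reliable


def fully_justified(pr1: Process) -> bool:
    """Full justifiedness: the conjunction of primary and secondary justifiedness."""
    return primary_justified(pr1) and secondary_justified(pr1)


def chain_reliable(pr: Optional[Process]) -> bool:
    """Stricter reading suggested by (i): every process in the acquisition chain is reliable."""
    current = pr
    while current is not None:
        if not current.reliable:
            return False
        current = current.acquired_by
    return True


# Pr1 is reliable and acquired by a reliable Pr2, which was acquired by an unreliable Pr3.
pr3 = Process("Pr3", reliable=False)
pr2 = Process("Pr2", reliable=True, acquired_by=pr3)
pr1 = Process("Pr1", reliable=True, acquired_by=pr2)
```

Here `fully_justified(pr1)` holds, because only Pr1 and Pr2 are inspected, whereas `chain_reliable(pr1)` fails because of Pr3. The two-level test thus leaves open exactly the iteration raised in (i).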

5.1.3 Comparison 

Externalist theories of justification make justification differ substantially from 
(theoretical) rationality. In this way they can explain why we sometimes call 
an agent’s belief ‘justified’ (a) although the agent is not able to argue for it by 
stating the reasons for which she holds the belief, (b) although she does not 
have a mental concept of reasons or of justifiedness, and (c) although conscious 
accessibility does not play a role. E.g., a dog’s belief that he is going to be fed 
might be called ‘justified’, if, say, it is 1 pm, he watches me prepare the food 
in the usual way, the dog always gets his food at 1 pm after my preparing the 
food, the dog has normal perceptual, inferential, and memorial capabilities and 
he has applied them in the recent past in the normal way. The dog is certainly 
not able to argue for his belief, at least not by external communicative means; 
he might neither be consciously aware of his occurrent belief concerning the 
fact that it is about 1 pm, nor of his occurrent perceptual belief concerning my 

**This problem seems to have been overlooked by Goldman. 
preparing the food, nor of his general, dispositional belief concerning his being
fed at a typical time and after typical preparations. The dog (presumably) also 
does not reflect on whether these beliefs are reasons. In such a case, the dog’s 
belief is definitely not justified according to classical internalism. But according
to a reliabilist theory of justification, the dog’s belief is indeed justified, since
the introspection process that has led to his belief that it is lunch time is
reliable, the perceptual process that has led to his current perceptual belief
concerning my preparing his food is reliable, and, similarly, the perceptual and
inductive processes that have led to his general belief that he is normally fed
at a certain time (around 1 pm) after certain preparations are reliable. This
raises the question whether the classical internalist theories of justification and
the externalist theories of justification are simply contradictory, s.t. at least
one family of theories is seriously flawed, or whether they do not so much
contradict each other but rather aim at the explication of different notions of
justifiedness. This question seems to be unsettled in the relevant literature.
But, as we will outline below, the notion of justifiedness that is explicated in
classical internalist theories of justification is indeed inappropriate as far as the
low-level theory of justified inference is concerned which we are aiming at.

On the other hand, the distinction between internalist and external-
ist theories of justification should also not be overestimated for the following
reasons:

• An internalist theory of justification where conscious access to reasons
is not presupposed is often hard to distinguish from an externalist the-
ory. Of course, a process that is justification-producing according to an
internalist theory is not so much truth-producing as in the reliabilist
case but rather truth-indication-producing; a process that is justification-
transmitting according to an internalist theory is not so much truth-
transmitting as in the reliabilist case but rather truth-indication-trans-
mitting. But since a reason is only good according to an internalist theory
of justification if it is truth-indicative, internalist theories of justification
at least seem to share the conviction with the externalist theories that the
justifiedness of a belief hints at the truth of the belief while its unjustified-
ness would not do so - notwithstanding certain pragmatist approaches to
internalist justification.

• Externalist theories assess mental (though not necessarily conscious) 
states and processes just as their internalist counterparts do, and thus 
they have a subjective component. 

• Neither kind of theory deals only with the normative basis of beliefs;
both also deal with their causal basis, and thus their advocates are forced
to regard the context of justification to be intimately connected to the
context of discovery (adopting the classical terminology of Reichenbach),
as long as the justification of mental states and processes is intended (and
not of sentences or theories).

• Externalist theories of justification may contain the same normative bridge 
axioms as internalist theories do; e.g., Goldman[69] regards the following 
as analytically true: ‘a belief is justified if and only if it is permitted by a 
right system of justificational rules’ (see p.330 in the appendix). 

• Finally, according to both families of theories, holding justified beliefs is 
normally not only a matter of cognitive virtue, but of utmost practical rel- 
evance - although this is rarely pointed out by epistemologists in order to 
separate epistemic justification from “mere” prudential justification (and 
sometimes also to separate themselves from pragmatist approaches). Ac- 
cording to our explication of the notion of belief in chapter 3, beliefs have 
an action-guiding function, and thus they play a major role in the cogni- 
tive activities of an agent. In particular, whether an agent is able to attain 
the goals that she has set for herself, and, even more urgent, whether she 
is able to survive in a complex and potentially hostile environment, is 
partially dependent on whether her beliefs are true. If the agent believes 
that [φ is true], she is guided to act in a manner that is “appropriate” to
the truth of φ; but if φ is false, then she is perhaps guided by her belief
that [φ is true] to act in an inappropriate manner, at least if there are not
some contingent circumstances that compensate for her false belief by lucky
coincidence. That is why it is normally favourable for an agent not only
to believe something, but also to justifiedly believe it, since this raises the
chances of believing something that is true rather than false, both accord-
ing to internalist and externalist theories of justification. The longer an
agent holds a certain false belief, the more risk she takes that the belief 
might guide her activities towards an unintended situation, since chances 
rise that her inappropriately chosen actions are no longer compensated 
by random factors. It is thus not only a question of cognitive or scientific 
responsibility to have justified beliefs, but also a question of practical per- 
formance “in the long run”; this pragmatic feature of justifiedness is due 
to its “truth-directed” character, and not the other way round - contra 
some of the pragmatist theories. 

In his attack on analytic epistemology, and, in particular, against ex- 
ternalist accounts of justification like Goldman’s, Stich[170] denies that 
truth-conducive processes are of practical value: on p.61, he points out 
that “. . . strategies of inference or inquiry that do a good job at gener- 
ating truths and avoiding falsehoods may be expensive in terms of time, 
effort, and cognitive hardware”, and on p.62 he adds “a very cautious,
risk-aversive inferential strategy will typically lead to false beliefs more
often, and true ones less often, than a less hair-trigger one that waits
for more evidence before rendering a judgment. Nonetheless, the unreli-
able, error-prone, risk-aversive strategy may well be favoured by natural
selection ...”

But our claim is not that a truth-conducive process is necessarily also 
useful; what we claim is that true beliefs are, in the long run, preferable 
over false beliefs, even from a practical point of view. If, e.g., two processes 
are equally expensive “in terms of time, effort, and cognitive hardware”, 
s.t. the one process is more reliable than the other one, the former process 
is surely practically preferable. Furthermore, reliability comes in degrees: 
in one case an agent may demand a high degree of reliability, i.e., of 
justification, in order to base an action on the thus justified belief; in 
another case, a lower degree is regarded as sufficient. But in any case a 
high degree of reliability is preferable, ceteris paribus, also in practical 
respects. 



5.2 The Ascription of Justified Belief 

Independently of how we interpret the phrase ‘being held for good reasons’, 
there are certain general constraints on ascribing justified beliefs to a system. 
These constraints result from the theory of belief as developed earlier. Accord- 
ing to this theory, we can speak of a belief as being justified only relative to 
(i) a parameter-setting s(t) (at time t) in which the belief is justified, and
in which it is caused or sustained by the good reasons, by a reliable process
or by something else, and to (ii) a trajectory (s(t))_{t∈In} of A, s.t. s(t) is the
parameter-setting of A at time t in the trajectory (s(t))_{t∈In}. This is for the fol-
lowing reasons: first, according to the explication in part I, we regard beliefs
as state types; but although it makes good sense to say that something causes 
or sustains a belief state type of an agent (this might e.g. be expressed by a 
general causal sentence), it does not make sense to say that a belief state type 
of A is held for good reasons without any reference to a particular parameter- 
setting at a particular time, since belief state types are only held in particular 
parameter-settings of A. The ascription of a justified belief is therefore only 
possible relative to a given parameter-setting. Secondly, when we say that a 
belief state type is held in a parameter-setting s(t), s.t. the belief state type
is caused (in s(t)) by good reasons or by a reliable process, it is entailed that
the reasons or the process must have been “there” (say, at t − Δt) before the
justified belief is caused at time t. Thus, the truth of a justified-belief ascrip-
tion is dependent on the cognitive “history” of the agent. Instead of referring
to the parameter-setting at t − Δt and the point of time t − Δt, we may simply
refer to the whole trajectory (s(t))_{t∈In} (both s(t) and s(t − Δt) are members
of the trajectory); if the justified belief is caused by a reliable process P, we
even must refer to a trajectory of parameter-settings, since we have assumed
processes (like P) to be sets of trajectories, and a belief is caused at t relative
to (s(t))_{t∈In} by a process if (s(t − Δt), . . . , s(t)) is a member of the process (for
some Δt).†† Thus, again, we can speak of a belief being caused by a reliable
process only relative to the agent’s cognitive history. Thirdly, in the case of
sustaining, we may indeed disregard the reference to the parameter-setting at
t − Δt, since the sustaining of a belief by reasons or by a process at t does not
depend on A’s cognitive past; but we still have to refer to the parameter-setting
s(t) in which the justified belief is held by the agent. If we take this together, we
see that justifiedness is actually not a property of beliefs, but rather a relation 
between beliefs, points of time, and trajectories; the point of time determines 
the parameter-setting in the trajectory in which the justified belief is held. 
When we say that a belief is justified simpliciter, we usually have in mind a 
particular point of time and also a particular trajectory of parameter-settings, 
and we only suppress the reference to these additional parameters in order to 
simplify matters. 
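
The relational character of justifiedness can be sketched in Python as follows. The sketch rests on assumptions that go beyond the text: time is discrete, a trajectory is a finite sequence of parameter-settings, a process is a finite set of trajectory segments, and "justified" is read in a crude reliabilist way as "held at t and caused at t by some reliable process". The precise causal condition is the one referred to as def.204, which is not reproduced here. The names `caused_at` and `justified` and the toy settings are hypothetical.

```python
from typing import FrozenSet, Hashable, Iterable, Sequence, Set, Tuple

State = Hashable                    # a parameter-setting s(t)
Trajectory = Sequence[State]        # (s(0), s(1), ..., s(n)); discrete time is assumed
Process = Set[Tuple[State, ...]]    # a process as a set of trajectory segments
Belief = FrozenSet[State]           # a belief state type: the settings in which it is held


def caused_at(process: Process, trajectory: Trajectory, t: int) -> bool:
    """True if (s(t - dt), ..., s(t)) belongs to the process for some dt >= 1."""
    return any(
        tuple(trajectory[t - dt : t + 1]) in process
        for dt in range(1, t + 1)
    )


def justified(belief: Belief, t: int, trajectory: Trajectory,
              reliable_processes: Iterable[Process]) -> bool:
    """Ternary relation between a belief, a point of time, and a trajectory."""
    if trajectory[t] not in belief:   # the belief must be held in s(t)
        return False
    return any(caused_at(p, trajectory, t) for p in reliable_processes)


# Toy parameter-settings and one process assumed to be reliable.
idle, sees_food, expects_food = "idle", "sees_food", "expects_food"
perceive_then_expect: Process = {(sees_food, expects_food)}
belief_fed: Belief = frozenset({expects_food})

history_a = [idle, sees_food, expects_food]   # the belief arises from perception
history_b = [idle, idle, expects_food]        # same setting at t = 2, different past
```

The same belief, held in the same parameter-setting at t = 2, comes out justified relative to `history_a` but not relative to `history_b`. This is why the belief alone does not determine its justifiedness.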

Now let us focus on the logical form of justified-belief ascriptions: we 
might either say that A’s belief that [φ is true] is justified at t relative to
(s(t))_{t∈In}, or we might say that A believes at t relative to (s(t))_{t∈In} justifiedly
that [φ is true]. In the latter case we have made use of a justified-belief operator
in the metalanguage. In the former ascription of justified belief we have used a
ternary predicate in the metalanguage which has been applied to the name of
a belief state of A, the name of a point of time, and the name of a trajectory
of A. In the case of the justifiedness ascription to belief states by means of a
predicate, we face the same complications that have urged us at the beginning
of section 4.4 in part I to choose a reason-for operator instead of a reason-
for predicate of belief states: it might e.g. be the case that A’s belief state
that [φ is true] is identical to A’s belief state that [ψ is true], i.e., for all
parameter-settings s, A believes in s that [φ is true] if and only if A also
believes in s that [ψ is true]. Let us furthermore assume that A’s belief that [φ
is true] is indeed justified at t relative to (s(t))_{t∈In}; by identity, this would entail
that A also believes at t relative to (s(t))_{t∈In} justifiedly that [ψ is true]. The
justifiedness of A's belief that [φ is true] would thus also depend on whether
A believes at t relative to (s(t))_{t∈In} justifiedly that [ψ is true]. In order to
circumvent such complications, we shall only ascribe justified beliefs with the 
help of a (metalinguistic) justified-belief operator. In the following we are going 
to use both kinds of justified-belief ascriptions syntactically, but when we use 
a predicate ascription this should always be translated tacitly to its operator 
counterpart. 



†† See def.204 in the appendix for a precise definition of ‘a process Pr causes a state X at
t (relative to (s(t))_{t∈In})’.

‡‡ A further possibility would be to speak only of belief tokens as being justified; but
that would complicate our theory (of belief, of inference, of justified belief, and of justified
inference) in other respects.




Chapter 6 

AN INFORMAL ACCOUNT OF OUR THEORY OF 
JUSTIFIED INFERENCE 



6.1 Intuitions 

Contrary to beliefs, inferences do not have an action-guiding function them- 
selves, but they are nevertheless part of an action-guiding system by means of 
their producing beliefs that do have such a function. Inferences are auxiliary pro-
cesses by which agents generate new beliefs; inference dispositions are auxiliary
devices by which agents may generate new beliefs. Accordingly, the justifica- 
tion of inferences is derivative on the justification of beliefs, or, more precisely, 
the justification of an inference depends on the justification of its conclusion 
belief. However, it would be wrong to define an inference to be justified simply 
if its conclusion belief were justified, since the unjustifiedness of the latter belief 
might merely be the consequence of the unjustifiedness of the premise belief of 
the inference; when we call an inference justified, we actually want to assess 
just the inference process, not its input. This may again be achieved by either 
presupposing an internalist theory of justified belief, or an externalist theory of 
justified belief (or a mixed theory, but we will again only consider the extreme
cases).

6.1.1 Internalist Justification of Inferences 

Let us first focus on the justification of inferences from an internalist point of 
view. According to the theory of inference that we have developed in part I, 
the (perhaps total) premise belief of an inference is a reason for the conclu- 
sion belief of the inference, s.t. the reason causes or sustains the conclusion 
belief. According to an internalist theory of justification, the conclusion belief 
is justified if and only if it is held for good reasons, s.t., if the good reasons 
are beliefs, then they are justified themselves, some process that leads from the 
reasons to the belief is justification-transmitting, the reasons and the process 
are cognitively accessible to the agent, and the truth-indicating character of 
the reasons is transmitted to the justified belief. 

When we call an inference process ‘justified’, (i) we are only inter- 
ested in the premise belief as a reason for the conclusion belief, (ii) we do 
not regard some process that leads from the premise belief to the conclu-
sion belief as justification-transmitting, but we rather regard the inference
process itself (that leads from the premise belief to the conclusion belief) as
justification-transmitting, and (iii) we want to express that the inference pro-
cess is justification-transmitting, independently of whether the premise belief
is really justified. It is a dispositional property of inferences to be justification-
transmitting. As Audi [12], p.159, formulates this: “inference is not a basic
source of justification or knowledge, but rather transmits and thereby extends
them, in appropriate circumstances, from . . . premises to the conclusion in-
ferred from them.” The justification of inferences is thus a conditional notion.
The only noteworthy exception is the case of an inference from a trivial premise 
belief (e.g., in the truth of the tautologically true T), where the justification- 
transmission from the trivial premise and the justification-production from 
scratch might perhaps be said to coincide. Instead of speaking of justification-
transmission, we can equivalently say: an inference is justified if and only if
it is the case that, given that its premise belief is justified, its conclusion be-
lief is justified precisely because its premise belief is justified. But it would be
insufficient to define an inference to be justified if and only if, given that its 
premise belief is justified, also its conclusion belief is justified, since there might 
be another good reason for the conclusion belief, s.t. the conclusion belief of 
the inference is not justified because its premise belief is justified but for the 
additional reason. In such a case, we would not be entitled to say that the 
inference from the premise belief to the conclusion belief was justified. Rather, 
an inference is justified only if the justification of its premise belief is trans- 
mitted to its conclusion belief, and the justifiedness of the conclusion belief is 
not produced by or transmitted from another mental state. 

If we added a consciousness requirement to our (then classical) inter- 
nalist theory of justified belief and inference, we would furthermore demand our 
agent to be consciously aware of the premise belief, of the conclusion belief, and 
of the inference process that is going on. 

Note that although the justification of an inference depends on the 
justification of its conclusion belief only given the justification of its premise 
belief, we do not speak of a justified inference given that the general beliefs 
that it is based on are also justified: as we have pointed out in part I, the 
latter general beliefs are not further premises of the inference, and therefore 
their being justified is not presupposed. In the analogy to logical derivations 
that we have drawn in part I, the general (and thus dispositional) beliefs that 
an inference is based on do not correspond to formulas but rather to rules of 
inference that are applied to premises. But we would not call the application 
of such a rule justified only given that the rule itself is justified; instead, we 
would call the application of a rule justified if the result (the conclusion) of the 
rule application is justified given and due to the justification of its input (the 
premise).

6.1.2 Externalist Justification of Inferences

Now let us turn to the externalist justification of inference, and let us focus on 
a process-reliabilist account of justification: according to such a type of theory, 
the conclusion belief of an inference is justified if and only if it is held-for-good- 
reasons, i.e., if and only if it is caused or sustained by some reliable process, s.t.: 
if the reliable process leads from a belief to the justified belief, then the former 
belief is justified itself, and the process is conditionally (first-order) reliable, 
i.e., justification-transmitting by being truth-transmitting to some high extent 
given the truth of the initial belief; if the process has itself been acquired by
some further (second-order) process, then also the latter process is (second-
order) reliable. When we call an inference process ‘justified’, (i) we do not
regard some process that leads from the premise belief to the conclusion belief 
as reliable, but we rather regard the inference process (that leads from the 
premise belief to the conclusion belief) itself as reliable, and (ii) we want to 
express that the inference process is not reliable simpliciter but conditionally 
reliable, i.e., reliable given that its premise belief is justified, independently of 
whether the premise belief is really justified. Just as before, an inference is 
justified only if its reliability is conditional on the justification of the premise 
belief; it is not justified if the justifiedness of the conclusion belief had been 
“produced” by a further reliable process that was different from the inference 
process. The generality problem that affects the reliabilist account of justified 
belief does not affect the reliabilist account of justified inference if the inference 
in question is a basic process, since we now do not have to define justified 
beliefs by reference to some reliable process, but we rather assess the processes 
themselves: but inference processes have been defined precisely in part I (as sets 
of sequences of parameter-settings satisfying some constraints), and we have also
defined under which conditions our agent A draws a certain inference. If the 
inference process is basic, no other process plays a significant role concerning 
the justification of the inference process, and thus a basic inference is justified 
if and only if it is justification-transmitting, i.e., if and only if it is conditionally 
reliable. It is just in the case of acquired inferences that the generality problem 
reoccurs, since the justification of an acquired process depends not only on the 
conditional (first-order) reliability of the inference itself, but also on how the 
acquiring process is identified, and on how (second-order) reliable this acquiring 
process is, i.e., to what degree it tends to produce more reliable first-order
processes than unreliable ones. 
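
The two reliability notions just distinguished can be illustrated by a minimal sketch. It assumes a simple relative-frequency reading of 'reliable', which is only one possible reading and is not meant to anticipate the explication given in chapter 7. The function names, the representation of cases as pairs of truth values, and the threshold of 0.9 are hypothetical choices made for illustration only.

```python
from typing import Iterable, Tuple


def conditional_reliability(cases: Iterable[Tuple[bool, bool]]) -> float:
    """First-order, conditional reliability under a frequency reading.

    Each case is a pair (premise_true, conclusion_true). Only the cases
    with a true premise count: we ask how often truth is transmitted.
    """
    relevant = [conclusion for premise, conclusion in cases if premise]
    if not relevant:
        raise ValueError("no case with a true premise")
    return sum(relevant) / len(relevant)


def second_order_reliability(first_order_scores: Iterable[float],
                             threshold: float = 0.9) -> float:
    """Second-order reliability under a frequency reading.

    Takes the conditional reliabilities of the first-order processes that
    an acquiring process has produced and returns the share of them that
    reach the (assumed) threshold for counting as reliable.
    """
    scores = list(first_order_scores)
    if not scores:
        raise ValueError("the acquiring process has produced no process")
    return sum(score >= threshold for score in scores) / len(scores)
```

On this reading, an acquiring process may be second-order unreliable even though one particular inference that it has produced is first-order reliable, which is the situation of example 40 below.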

We are next going to argue that the notion of justifiedness that we are 
after is not the notion of justifiedness that is explicated in the usual internalist 
theories of justified belief and inference. 
6.2 The Low-Level Postulate

As we have seen in part I, our definition of the notion of inference applies also to 
agents which have rather restricted cognitive means, but which are nevertheless 
able to draw inferences. Neither consciousness, nor any “higher-order” reflective
mental state, nor the capability of internal or external argumentation has been 
presupposed. Thus, our definition of the notion of inference allows for low- 
level agents drawing inferences. Indeed, we hypothesize that in fact all natural 
agents of some minimal degree of complexity (which is much below the one of 
humans) draw inferences*: just reconsider the cat&bird example 30 from the 
last chapter. But that is of course an empirical question which is to be settled 
by biologists and psychologists. 

Furthermore, our explication of the notion of inference has been per- 
fectly consistent with the possible case of an artificial agent (e.g., a computer) 
that draws inferences. 

Now we will complement our low-level approach to inference by an 
epistemological “low-level postulate” that is going to shape our theory of jus- 
tification for inferences: 

• (The Low-Level Postulate on How We Understand Justification) 

Whatever our definition of ‘justified inference’ (or of ‘justified belief’) 
may look like, it should be satisfiable also by low-level agents; i.e., our 
definition should not preclude that agents of a complexity level below the 
level of human beings draw a significant number and variety of justified 
inferences (or have a significant number and variety of justified beliefs). 

The postulate articulates the following intuition: the cat agent referred 
to in our cat&bird example is not only drawing a certain normic inference - she 
is even justifiedly doing so. Although the cat is (as we presume) not reflecting
on what kind of mental process is going on when she draws that inference, 
although she is definitely not able to argue in favour of her conclusion, and 
although she is perhaps not consciously aware of the reason for her conclusion 
or of the inference process itself, there still seems to be a well-defined sense of 
‘being justified’ according to which we may say: under the conditions of the 
cat&bird example, A justifiedly draws the nonmonotonic inference from the
total belief that [Bird(a) is true] to the belief that [CanFly(a) is true]. Thus,
given that the cat’s initial total belief that [Bird(a) is true] is actually justified,
also her conclusion belief that [CanFly(a) is true] is justified. It is this sense
of ‘being justified’ that we are now interested in. Moreover, the cat is definitely 
able to draw a significant number of similar inferences. 

*In particular: we hypothesize that they draw, to a high extent, high probability or normic 
inferences. 
The theory of justification to be developed for inferences is constrained
by the low-level postulate in so far as the latter excludes any of the typical 
internalist (in particular: classical internalist) conceptions of justification, since: 
(i) we assume the cat in the cat&bird example to have justified beliefs, and to 
draw justified inferences; (ii) according to an internalist theory of justification, 
a justified conclusion belief of an inference is held for the premise belief being 
a good reason for the conclusion belief, given that the premise belief is justified 
itself; (iii) an internalist theory of justification demands the cat to be able to 
reflect on the premise belief, to judge it as being a good reason for the conclusion
belief, and (iv) - in the case of a classical internalist theory - to be consciously 
aware of the premise belief, the conclusion belief, and the inference; (v) but the 
cat is not able to do what is demanded in (iii), and she is not necessarily in the 
state that is described in (iv), or so we presume. 

This does of course not exclude the possibility of future biological, 
neurological, or psychological findings which show that the demands stated 
in (iii) and/or (iv) are indeed satisfied by low-level agents; but at this point 
we hypothesize that the internalist family of theories of justification is too 
much a counterfeit of our introspective account of human, rational, and even 
scientific reasoning in order to serve as the theoretical framework for the notion 
of justifiedness for low-level agents that we are interested in. 

We are therefore going to develop our theory of justified inference as 
an externalist kind of theory; moreover, we opt for a process-reliabilist the- 
ory in the style of Goldman. This theory is going to lead, as we claim, to a 
satisfying theory of justified inference, since (i) it conforms to our low-level 
postulate, (ii) it explains why the cat’s inference from above is justified and 
why it thus subserves the cat’s needs under normal circumstances by increasing
her cognitive “fitness”, and (iii) the reliability approach is - though often
not recognized as such! - underlying many of the modern developments in the 
field of Nonmonotonic Reasoning, both in its logical-mathematical, and in its 
technical-implementational respects. According to our theory, it turns out to be 
possible for a low-level agent to draw justified inferences. The term ‘low-level 
agent’ itself will not be defined precisely since we do not know of any way in which
one could do this more explicitly - but we will give specific examples of
what kinds of agents are not low-level agents. Whether some particular “real- 
world agent” satisfies the assumptions on our agent A that we have taken for 
granted is an empirical question. 



There is a further postulate which we are going to presuppose: whatever 
our definition of ‘justified inference’ may look like, it should be satisfiable in 
principle also by artificial agents (e.g., computer systems). This postulate is 



^The only exceptions that we are aware of are Pearl[118], section 10.1, and - more clearly 
- Schurz[146], [147], where the latter one approaches nonmonotonic reasoning explicitly by 
reliability considerations. 
much less demanding than the postulate on the low-level approach - it just says
that our epistemological theory should not be guilty of biologistic prejudices: it
is not within the reach of epistemology to decide ex cathedra whether artificial 
agents will in future be able to draw justified inferences or not; this is again 
subject to further empirical research. 

6.3 A Reliabilist Account of Justified Basic/Acquired Inferences

Since we turn now to a reliabilist theory of justified inference we have to 
deal with (i) the distinctions between basic and acquired inferences, first- and 
second-order processes, and first- and second-order reliability, which are important
for a theory of primarily/secondarily/fully justified inference, (ii) the
generality problem concerning the justification of acquired inferences, (iii) the 
logical format of justified inference ascription, and (iv) what is meant precisely 
by ‘reliable’. We will consider topic (i) in the next subsection, topic (ii) in 
subsection 6.3.2, topic (iii) in subsection 6.3.3, and topic (iv) in chapter 7. Af- 
ter these preparations we can state our formal theory of justified inference in 
chapter 8. 

6.3.1 Distinctions and Examples 

Let us start by recapitulating the notions of ‘basic processes’ and ‘acquired 
processes’ introduced by Goldman: when Goldman refers to acquired processes, 
he speaks in the manner that a process could literally be acquired by an agent 
- but this is surely just an abbreviated way of saying that the agent acquires 
either an occurrent state in which the process in question is going on, or that 
she acquires a disposition to initiate this process under certain circumstances, 
or that she acquires the capability to initiate and sustain the process; this is 
because, according to common usage, one may e.g. acquire a bad habit, or a 
taste for something, or ownership of something, i.e., a state, but not an activity.
Since we are only interested in inference processes at this point, and since - 
according to part I - direct inferences are caused by the activation of general 
beliefs, i.e., of certain dispositional states, we can define basic/acquired direct 
inferences as direct inferences that are based on basic/acquired general beliefs.
A general belief is basic if it has not been acquired by A. An inference in 
general, i.e., an inference which is not necessarily direct, is basic if and only 
if each of its direct component inferences is basic; it is acquired otherwise. 
Inferences, as we have defined them in part I, are first-order processes which 
produce singular beliefs. The processes by which inferences are acquired, i.e., 
by which the general beliefs that the inferences are based on are acquired, are 
processes that produce general beliefs, and thus they are actually both first- 
and second-order processes, since (i) they are belief-producing, and (ii) they 
contribute to the generation of inference processes - though indirectly - by 
means of producing the dispositional states on which the inference processes 
are based. 
Let us have a look at some examples:

Example 38 

A robot agent infers nonmonotonically that the object right in front of 
her is something that is able to fly, from the total belief that the object right in 
front of her is a bird. The inference is based on the robot’s high probability belief 
that by far the most birds can fly, or on her normic belief that normal birds can 
fly. The robot has got one of the latter general beliefs because the belief state 
has been “built into” the agent by her creators; the general belief might, e.g.,
involve a fixed dispositional routine of the form: “if all that is contained in your
factual (perhaps perceptual) knowledge base is a sentence expressing that there 
is a bird right in front of you, then add a sentence that expresses that there is 
something in front of you that is able to fly”. 

Example 39 

A robot agent infers nonmonotonically that the object right in front of 
her is something that is NOT able to fly, from the total belief that the object right 
in front of her is a bird. The inference is based on the robot’s high probability 
belief that by far the most birds can NOT fly, or on her normic belief that 
normal birds can NOT fly. The robot has got one of the latter general beliefs 
for analogous reasons as in example 38. 

Example 40 

A human agent infers nonmonotonically that the object right in front 
of her is something that is able to fly, from the total belief that the object right 
in front of her is a bird. The inference is based on the agent’s high probability 
belief that by far the most birds can fly, or on her normic belief that normal 
birds can fly. The agent has got one of the latter general beliefs because (i) 
she believes everything that she is told to believe (concerning birds) by someone 
whom she believes to have sufficient power and force; (ii) she has been told to 
acquire this general belief by precisely such a person. 

Example 41 

Reconsider example 40, but now given that the agent has got her general 
beliefs because (i) she believes everything that she is told to believe (concerning 
birds) by someone whom she believes to have sufficient knowledge and expertise 
on birds, after having tried hard and competently to find out whether the one 
who tells her so indeed has sufficient knowledge and expertise on birds; (ii) she 
has been told to acquire the general belief by precisely such a person. 

Example 42 (The Cat&Bird Example Reconsidered) 

The cat agent infers nonmonotonically that the object right in front of 
her is something that is able to fly, from the total belief that the object right 
in front of her is a bird. The inference is based on the cat’s high probability 
belief that by far the most birds can fly, or on her normic belief that normal 
birds can fly. The cat has got one of the latter general beliefs because she has
watched a lot of birds before, and by far the most of them have been able to fly 
(the cat could see them fly away when she tried to catch them); from this set of 
occurrent, singular perceptual beliefs she has inductively formed a hypothesis, 
namely, either the dispositional general belief that by far the most birds can fly, 
or the dispositional general belief that normal birds can fly. 

In each of the examples, the agent draws a nonmonotonic inference 
from the total belief that the object right in front of her is a bird to the 
belief that the object right in front of her is something that is able to fly. The
inferences referred to in the examples 38 and 39 are basic inferences; the other 
inferences are acquired inferences, where the inferences have been acquired by 
different kinds of acquiring processes. 

Now let us ask for the intuitive justifiedness of the example inferences: 
intuitively, 

1. the inference in example 38 is justified: if the robot’s total premise belief 
that the object right in front of her is a bird is justified, this justification is 
transmitted to the robot’s conclusion belief that the object right in front 
of her is something that is able to fly, since this inference is (objectively)
reliable; indeed, most birds can fly, and, also, normal birds can fly. The
robot’s inference has not been acquired. 

2. the inference in example 39 is not justified: if the robot’s total premise 
belief that the object right in front of her is a bird is justified, this justifi- 
cation is not transmitted to the robot’s conclusion belief that the object 
right in front of her is something that is not able to fly, since this inference
is unreliable; it is neither the case that most birds cannot fly, nor
that normal birds cannot fly (in fact, even the corresponding denials hold:
most birds indeed can fly, and, also, normal birds can fly).

3. the inference in example 40 is not (fully) justified: if the agent’s total 
premise belief that the object right in front of her is a bird is justified, 
this justification is not fully transmitted to her conclusion belief that
the object right in front of her is something that is able to fly. On the one
hand, since this inference is (first-order) reliable, i.e., truth-preserving
in most of the cases, and also in the normal cases, it might be called
“primarily justified”. However, since the inference is based on a general
belief that has been acquired by a (second-order) unreliable second-order 
process, namely the process of blindly believing everything that one is 
told to believe by one who is believed to have sufficient power and force, 
the inference is not “secondarily justified”. Normally, this second-order 
process leads to unreliable first-order processes; put statistically, it leads 
in most cases to unreliable first-order processes. Thus, the inference is 
not justification-transmitting, since the process that leads to the general
belief that the inference is based on is not truth-transmitting - although
the inference process that leads from the premise belief to the conclusion 
belief is itself truth-transmitting. 

4. the inference in example 41 is (fully) justified: if the agent’s total premise 
belief that the object right in front of her is a bird is justified, this jus- 
tification is transmitted to her conclusion belief that the object right in 
front of her is something that is able to fly. Since this inference is (first-order)
reliable, i.e., truth-preserving in most of the cases, and also in the
normal cases, it might be called “primarily justified”. Moreover, since 
the inference is based on a general belief that has been acquired by a 
(second-order) reliable second-order process, namely the process of se- 
lecting competently and with care someone who knows about birds (and 
who is no liar, etc.), and then believing everything that this expert tells 
her about birds, the inference is also “secondarily justified”.

5. the inference in example 42 is (fully) justified: if the cat’s total premise 
belief that the object right in front of her is a bird is justified, this justi- 
fication is transmitted to the cat’s conclusion belief that the object right 
in front of her is something that is able to fly. First of all, since this inference
is (first-order) reliable, i.e., truth-preserving in most of the cases,
and also in the normal cases, it might be called “primarily justified”. Furthermore,
since the inference is based on a general belief that has been
acquired by a (second-order) reliable second-order process, namely the 
process of observing, and by an inductive reasoning process that consists 
of learning and generalizing from a lot of instance cases, the inference is 
also “secondarily justified”. 

In the examples 38-42 we have intuitively judged inference processes as 
being justified or unjustified. E.g., the inference in 39 is not justified because it 
is not primarily justified, i.e., the inference process is not first-order reliable, i.e., 
not truth-conducive; if the conclusion belief of the inference is sustained just by 
the inference, then it is consequently also not justified, even if the premise belief 
of the inference has been justified. The conclusion belief is not epistemically 
permitted to be held; put differently, the robot agent is epistemically obliged 
to abandon her belief. The robot can furthermore not be said to know that 
the object right in front of her is something that is not able to fly, even if the
latter is actually the case. If the conclusion belief is true, then it is so just 
by lucky coincidence. Finally, the robot’s disposition to draw this statistically 
faulty inference may give rise to practical troubles, since its activation does not 
lead to true beliefs most of the time. 

The inference in 40 is unjustified, but for different reasons than in 
39: although it is primarily justified since it is (first-order) reliable, it is not 
secondarily justified, because the general belief it is based on has been acquired 
by a (second-order) process that is not (second-order) reliable, i.e., not
truth-conducive. The same claims hold as in the previous case: if the
conclusion belief of the inference is sustained just by the inference, then it is 
not justified, even if the premise belief of the inference has been justified. The 
conclusion belief is not epistemically permitted to be held; put differently, the 
agent is epistemically obliged to abandon her belief. She can also not be said to 
know that the object right in front of her is something that is able to fly, even 
if this is actually the case. If the conclusion belief is true, then it is again so 
just by lucky coincidence. Finally, although the agent’s disposition to draw this 
statistically praiseworthy inference will not give rise to any practical troubles, 
since its activation leads to true beliefs most of the time, the agent’s disposition 
to acquire inferences, i.e., general beliefs, in the way described in the example 
may indeed cause practical problems. By calling the inference process primarily 
justified, but not secondarily (and thus not fully) justified, we point out that 
the inference itself is reliable, but not the process by which the inference has 
been acquired. 

It might be argued that if the inference in 40 is unjustified, this is 
perhaps also the case for the basic first-order reliable inference in 38: although 
the robot’s inference has indeed not been acquired by the robot, the general
beliefs on which the inference is based have been “built into” the agent by her 
creators, and this might be an unreliable second-order process. The difference 
between this case and the inference in example 40 is that the second-order 
process is a cognitive process of the robot agent in the latter case but not in 
the former. But in an epistemological context, we should restrict the domain of 
states and processes that we study and assess just to cognitive ones - otherwise 
we would also have to deal with the “justification” of evolution (or divine or 
human creation), birth, growing up, etc. 

6.3.2 The Generality Problem Affecting the Reliabilist Justification of Inferences;
Inductive Reasoning Processes as Second-Order Processes

In each of the examples 40-42, we have explicitly described “the” processes by 
which the example inferences have been acquired. But we have already seen 
above that there might be more than one process that causes or sustains the 
general belief(s) on which the inference in focus is based, and that these pro- 
cesses might indeed differ concerning their degree of reliability. This is a serious 
problem for every reliabilist account of justification^, and we do not see any 
obvious and complete solution to this problem. Our pre-theoretical notion of

^ However, we think that this is not only a problem for reliabilist theories but that there 
is an analogous problem that affects the internalist accounts of justification which do not 
presuppose conscious access: what are “the” reasons for which an agent holds a certain 
belief? And: by which mental process do the “good” reasons cause or sustain a justified 
belief? One possible answer to these questions would be a holistic ( “Quinean” ) account of 
justification for belief systems in which all possible reasons would be relevant, and in which 
the individuation of mental processes would therefore be largely irrelevant: this would be a 
version of a coherence theory of justification. 
justification seems to rely both on a “natural” individuation of cognitive processes
and a common sense assignment of beliefs to the very processes that
these beliefs are supposed to be caused or sustained by. We suggest complementing
this intuitive account by the following theoretical one: (i) identify the
“plausible” candidate processes that cause or sustain a belief: this is usually 
not the task of a philosophical theory, but rather the topic of the best empirical 
state-of-the-art theories that psychologists, or neuroscientists, or computer sci- 
entists, etc., can offer, depending on what kind of cognitive agent is concerned; 
a “plausible” process is individuated as specifically as necessary in order to 
eliminate theoretically irrelevant factors, but it is also individuated as gener- 
ally as possible in order to give us a strong and non-trivial theory; (ii) if there is 
more than one such “natural” or “plausible” process, then relativize the notion 
of justification to the processes in question, and distinguish between different 
kinds or levels of justification; Goldman’s distinction of primary, secondary, and 
full justification is an example of such a relativization. When we state our theory
of justified inference in chapter 8, we will follow precisely this strategy:
if an inference is basic, it is justified if and only if it is conditionally reliable; 
if an inference is acquired, then it is primarily justified if it is conditionally 
reliable, it is secondarily justified if it is acquired by a (second-order) reliable 
process, and it is fully justified if it is both primarily and secondarily justified. 
In the case of acquired inferences, we will restrict ourselves to a special class 
of acquiring processes by which the general belief(s) that an inference is based 
on is acquired from non-general beliefs: we will assume that each such process 
is an inductive reasoning process. But this is just one plausible option, which 
is not meant to entail that there would not be other candidates for acquiring 
processes: e.g., as Harman[79] and Armstrong[10] argue, inference to the best 
explanation would even be more basic and central than induction.
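
The strategy just outlined can be summarized in a small sketch that classifies an inference once its reliability properties are given. The class `Inference`, the function `verdict`, and the encoding of a basic inference as one without any acquiring process are hypothetical illustrations; the truth values assigned to the examples 38-42 simply restate the intuitive judgements of subsection 6.3.1, and whether a given process is in fact reliable is not decided by the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Inference:
    name: str
    # First-order property: is the inference conditionally reliable?
    conditionally_reliable: bool
    # Second-order property of the acquiring process.
    # None encodes (by assumption) that the inference is basic.
    acquiring_process_reliable: Optional[bool] = None

    @property
    def is_basic(self) -> bool:
        return self.acquiring_process_reliable is None


def verdict(inference: Inference) -> str:
    """Apply the basic/acquired case distinction to one inference."""
    if inference.is_basic:
        # Basic: justified if and only if conditionally reliable.
        return "justified" if inference.conditionally_reliable else "not justified"
    primary = inference.conditionally_reliable
    secondary = bool(inference.acquiring_process_reliable)
    if primary and secondary:
        return "fully justified"
    if primary:
        return "primarily but not secondarily justified"
    if secondary:
        return "secondarily but not primarily justified"
    return "neither primarily nor secondarily justified"


examples = [
    Inference("example 38 (robot, built-in)", True),
    Inference("example 39 (robot, built-in, faulty)", False),
    Inference("example 40 (human, obeys power)", True, False),
    Inference("example 41 (human, trusts expert)", True, True),
    Inference("example 42 (cat, induction)", True, True),
]

for example in examples:
    print(example.name, "->", verdict(example))
```

The sketch makes visible that the generality problem enters only through the second field: for basic inferences there is nothing to fill in, whereas for acquired inferences the value depends on how the acquiring process has been individuated.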

In psychology, ethology, cognitive science, and of course in the philo- 
sophical tradition, inductive reasoning processes are regarded as plausible can- 
didates for the second-order processes by which general beliefs are acquired 
in natural and artificial agents; thus they are also plausible candidates for 
inference-acquiring processes. In our cat&bird example, the cat has acquired
her general normic belief that birds normally can fly by such an inductive reasoning
process. We call such processes inductive reasoning processes and not
inferences in order to indicate that such processes are of higher order than the 
processes that we have defined as inferences in part I; therefore, we reserve the 
term ‘inference’ for a special kind of process that leads to singular beliefs only. 
Let us now deal with inductive reasoning processes in more detail. 

The term ‘inductive’ is used in the literature with different connota- 
tions: 

• sometimes, in particular in cognitive science, inductive reasoning pro- 
cesses are understood as learning processes of any kind, independently of 
whether they are justified (in some sense of justifiedness) or not, and also
independently of whether their output is a general belief or not 

• if ‘inductive’ is understood more specifically (but still in a broad sense) 
and with a more philosophical connotation, an inductive reasoning pro- 
cess is a deductively invalid reasoning process which is nevertheless jus- 
tified, but with respect to a weaker notion of justifiedness than strict 
truth-preservation; an inductive reasoning process could then be defined 
as any reasoning process which is inductively strong (cf. Skyrms[161], 
p.7), i.e., where (i) it is improbable that its conclusion is false while its 
premises are true, and (ii) the reasoning process is not deductively valid. 
Actually, it is not inductive reasoning processes that are usually dealt 
with in the philosophical literature, but rather inductive arguments - 
but since we are not so much interested in the latter here, we apply the 
usual terminology to the former 

• if ‘inductive’ is understood in the restricted sense of the old philosophical 
tradition, an inductive reasoning process is (i) inductively strong, and (ii) 
it proceeds from the specific to the general, i.e., from singular premise 
beliefs to a general conclusion belief 

• finally, inductive reasoning is sometimes meant to be inductive reasoning 
to scientific laws, or law-like scientific sentences only. 

We will understand ‘inductive reasoning process’ in a way that com- 
bines aspects of the first and the third notion; thus, our notion of an inductive 
reasoning process is a very general and purely causal one without any norma- 
tive dimension. Inductive reasoning processes in the sense of inductively strong 
reasoning processes will be called ‘justified inductive reasoning processes’. Addi- 
tionally, we restrict inductive reasoning processes to only those processes which 
lead directly or indirectly from singular beliefs to general beliefs, whereas it is 
inferences that lead to singular beliefs. Finally, it follows from our low-level 
approach that we are not primarily interested in the reasoning processes that 
lead to (beliefs in the truth of) scientific laws. 

The considerations concerning the generality problem have taught us 
that it might be not so clear by which process a certain general belief has been 
acquired. Let us compare this problem to the way in which we have defined inferences
in chapter 3: the inference from α[a] to β[a] is a set of parameter-setting
trajectories meeting certain constraints among which there is the condition that
the trajectories lead from the belief/total belief of α[a] to the belief of β[a]. In
a similar way, we might define the inductive reasoning process which leads to
the general belief of α[x] → β[x], or α[x] ⇒ β[x], as the set of trajectories
that satisfies certain conditions. As the inputs of such processes we might either
consider sequences of belief states, or sequences of total belief states. Such
sequences would cause A to generalize from previous experiences by adopting
the general belief that [α[x] → β[x] / α[x] ⇒ β[x] is true]. An explication of the

notion of inductive reasoning process could thus be given if only the list of fur- 
ther conditions were available by which we would additionally judge processes 
as performing inductive reasoning. Since a comprehensive theory of induction is 
beyond our aims (and capabilities), we will fortunately not have to look for such 
a thorough explication of what an inductive reasoning process is. Furthermore, 
we will not search for a general theory of justification for induction, which is 
convenient in view of the well-known complexity of this issue and the notorious 
difficulties that would be involved (Hume’s classic problem of induction, the 
new problems including Goodman’s new riddle of induction, etc.). However, 
there is one aspect of inductive reasoning processes which we will indeed focus 
on: it seems to be a plausible hypothesis that inductive reasoning processes 
are often composed of special cognitive subprocesses, i.e., learning processes 
and reasoning processes.§ Let us consider inductive reasoning processes from
sequences of perceptual beliefs (which are singular as we have seen): if a natural
cognitive agent of sufficient complexity - like the cat in our cat&bird example
- either has (i) the singular perceptual belief that [α[a] ∧ β[a] is true] or (ii) the
singular perceptual belief that [¬α[a] is true] over and over again, but she only
rarely has the singular perceptual belief that [α[a] ∧ ¬β[a] is true], and if some
additional conditions are satisfied, then the agent will normally learn from her
experiences and adopt the general belief that [α[x] ⇒ β[x] is true], depending
on what is meant by ‘over and over again’ and by ‘some additional conditions’.
Analogously for α[x] → β[x], but here any exception to α[x] → β[x] has to be
excluded. This kind of learning process is based on recurrence and seems to 
be processed rather independently of the other beliefs - in particular, of the 
general beliefs - that the agent has additionally. But perhaps not every general 
belief of a natural agent is acquired in precisely that way: natural agents are 
usually supposed to depend on the availability of general beliefs that have not 
been acquired by learning from perceptual experiences but rather by reason- 
ing from other general beliefs that they already have. Such reasoning processes 
are not based on recurrence, although their inputs may be based on recurrent 
experience. We call such reasoning processes ‘(first-order) monotonic’ if they 
lead from strict general beliefs to a further strict general belief, and we call 
them ‘(first-order) nonmonotonic’ if they lead from a set of general beliefs that 
contains at least one defeasible general belief to a further defeasible general 
belief. E.g., in the usual symbolic computation agents, learning processes alone 
would only lead to highly incomplete sets of general beliefs; these sets may be 
supplemented by monotonic or nonmonotonic reasoning processes. The larger 
the set of general beliefs is that such an agent is supposed to have, the more 
likely it is that the agent has created those beliefs by reasoning. Similarly to 



§Many of the artificial (in particular: symbolic computation) agents that have been de- 
signed up to now in order to generate inductive reasoning processes, also seem to satisfy such 
a compositionality principle. 
inferences from singular beliefs, a reasoning process may itself proceed from
a set of general beliefs or from a total set of general beliefs: in the first case
we call the reasoning process ‘(second-order) monotonic’, in the second case
‘(second-order) nonmonotonic’. While learning processes seem to be indispensable
for the inductive generation of general beliefs, reasoning processes are
only auxiliary processes that an agent may or may not use in order to generate 
general beliefs from other general beliefs. 

In the following we assume that only such second-order processes take
place in A that are of the kind we have just described, and we also restrict A’s
second-order processes in further respects: 

• (Assumptions on A’s Inductive Reasoning Processes) 

— A’s second-order processes are (i) either learning processes from se- 
quences of (perhaps total) perceptual and thus singular beliefs to 
general beliefs, or (ii) monotonic or nonmonotonic reasoning pro- 
cesses from the general beliefs produced by former learning processes 
or by former monotonic/nonmonotonic reasoning processes to fur- 
ther general beliefs, or (iii) inductive reasoning processes, i.e., com- 
positions of (i) and (ii), where (ii) is maybe just the “zero” process - 
therefore, we may also call learning processes themselves ‘inductive 
reasoning processes’ 

— the contents of both the premise beliefs and of the conclusion belief 
of a single monotonic reasoning process may be expressed by uni- 
versal conditionals, i.e., we restrict ourselves to monotonic reasoning 
processes that are also deductive reasoning processes; thus, despite
the fact that we are going to state general definitions for an arbitrary
strict conditional $\rightharpoonup$, one may always think of $\rightharpoonup$ as being replaced by $\rightarrow$
as far as A is concerned

— the contents of both the premise beliefs and of the conclusion be- 
lief of a single nonmonotonic reasoning process may be expressed 
by either (i) high probability conditionals, and - if there are such 
premise beliefs at all - by universal conditionals, s.t. the content of 
the conclusion belief is in any case expressed by a high probability 
conditional, or by (ii) normic conditionals, and - if there are such 
premise beliefs at all - by universal conditionals, s.t. the content 
of the conclusion belief is in any case expressed by a normic con- 
ditional. This is just in order to constrain the plenitude of possible 
kinds of reasoning processes 

— A’s second-order processes are basic: if they were not, we would have 
to deal with the justification of the processes by which A acquires 
her reasoning processes. By the latter assumption, we may neglect 
such complications 
— we neglect the case of second-order nonmonotonic reasoning processes,
for the sake of simplicity, except for a few remarks.

Let us introduce some notations for learning process ascription and 
reasoning process ascription (but we omit notations for A’s dispositions to 
initiate such processes): 

Let $\langle s(t - \Delta t), \ldots, s(t) \rangle$ be a sequence of parameter-settings; let $\rightharpoonup$
be an arbitrary strict conditional, $\Rightarrow$ an arbitrary defeasible conditional; let
$\leadsto_{mon}$, $\leadsto_{non}$, $\leadsto_{ind}$ be fixed binary connectives - we extend our formal
language by the former connectives as may be seen below.

We say: 

• (Notation for Learning Process Ascription I) 

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha[a] \wedge \beta[a] \leadsto \alpha[x] \rightharpoonup \beta[x]$ if and only if A acquires
in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a learning process the strict belief that [$\alpha[x] \rightharpoonup \beta[x]$
is true] from having in at least one parameter-setting from $s(t - \Delta t)$
to $s(t)$ either (i) the singular belief that [$\alpha[a] \wedge \beta[a]$ is true] or (ii) the
singular belief that [$\neg\alpha[a]$ is true], while having in no such parameter-setting
the singular belief that [$\alpha[a] \wedge \neg\beta[a]$ is true].

• (Notation for Learning Process Ascription II) 

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha[a] \wedge \beta[a] \leadsto \alpha[x] \Rightarrow \beta[x]$ if and only if A acquires
in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a learning process the defeasible belief that
[$\alpha[x] \Rightarrow \beta[x]$ is true] from having in most parameter-settings from $s(t - \Delta t)$
to $s(t)$ either (i) the singular belief that [$\alpha[a] \wedge \beta[a]$ is true] or (ii) the
singular belief that [$\neg\alpha[a]$ is true], while having only very rarely, if at
all, the singular belief that [$\alpha[a] \wedge \neg\beta[a]$ is true].

We will state this more succinctly as: 

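$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha[a] \wedge \beta[a] \leadsto \alpha[x] \rightharpoonup \beta[x]$ if and only if A
acquires $\alpha[x] \rightharpoonup \beta[x]$ in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a learning process from
$\alpha[a] \wedge \beta[a]$;

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha[a] \wedge \beta[a] \leadsto \alpha[x] \Rightarrow \beta[x]$ if and only if A
acquires $\alpha[x] \Rightarrow \beta[x]$ in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a learning process from
$\alpha[a] \wedge \beta[a]$.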

The essential difference between ‘. . . $\alpha[x] \rightharpoonup \beta[x]$’ and ‘. . .
$\alpha[x] \Rightarrow \beta[x]$’ is that for the first kind of learning process, which leads to a strict
general belief, the agent is assumed not to have believed any exceptions
to $\alpha[x] \rightharpoonup \beta[x]$, whereas for the second kind of learning process, which leads to
a defeasible general belief, the agent is just assumed not to have believed many
exceptions to $\alpha[x] \Rightarrow \beta[x]$ (granted the vagueness of ‘many’).
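
The contrast between the two kinds of learning process can be illustrated by a small Python sketch. It is only a toy rendering of the two ascription clauses: the labels in `Obs`, the function `learned_belief`, and the numerical thresholds `most` and `rare` are hypothetical stand-ins for the vague ‘most parameter-settings’ and ‘only very rarely’, and the ‘additional conditions’ mentioned in the text are not modelled at all.

```python
from enum import Enum


class Obs(Enum):
    """Hypothetical labels for the singular belief held in one parameter-setting."""
    A_AND_B = 1      # belief that [alpha[a] and beta[a] is true]
    NOT_A = 2        # belief that [not-alpha[a] is true]
    A_AND_NOT_B = 3  # belief that [alpha[a] and not-beta[a] is true], an exception
    OTHER = 4        # no relevant singular belief in this parameter-setting


def learned_belief(observations, most=0.9, rare=0.05):
    """Classify a sequence of parameter-settings.

    Returns 'strict' if there is at least one supporting belief and no
    exception, 'defeasible' if supporting beliefs occur in most settings
    and exceptions only rarely, and None otherwise. The thresholds are
    assumptions made for illustration only.
    """
    n = len(observations)
    if n == 0:
        return None
    support = sum(1 for o in observations if o in (Obs.A_AND_B, Obs.NOT_A))
    exceptions = sum(1 for o in observations if o is Obs.A_AND_NOT_B)
    if support >= 1 and exceptions == 0:
        return "strict"
    if support / n >= most and exceptions / n <= rare:
        return "defeasible"
    return None
```

A sequence without any exception is classified as ‘strict’ here even if it would also satisfy the clause for defeasible learning; this preference is a design choice of the sketch, not something the definitions above prescribe.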

Furthermore, we use the following terminology: 
• (Notation for Monotonic Reasoning Process Ascription)

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[x] \rightharpoonup \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup \beta_n[x] \leadsto_{mon} \alpha[x] \rightharpoonup \beta[x]$
if and only if A acquires in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a monotonic
reasoning process the strict belief that [$\alpha[x] \rightharpoonup \beta[x]$ is true] from believing
in $s(t - \Delta t)$ that [$\alpha_i[x] \rightharpoonup \beta_i[x]$ is true] for all $i \in \{0, \ldots, n\}$.

• (Notation for Nonmonotonic Reasoning Process Ascription) 

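$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[x] \rightharpoonup/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup/\Rightarrow \beta_n[x] \leadsto_{non} \alpha[x] \Rightarrow \beta[x]$
if and only if A acquires in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a nonmonotonic
reasoning process the defeasible belief that [$\alpha[x] \Rightarrow \beta[x]$ is
true] from believing in $s(t - \Delta t)$ that [$\alpha_i[x] \rightharpoonup/\Rightarrow \beta_i[x]$ is true] for all
$i \in \{0, \ldots, n\}$, where at least one of the conditional premises is defeasible,
i.e., of the form $\alpha_i[x] \Rightarrow \beta_i[x]$.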

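Put more briefly:

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[x] \rightharpoonup \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup \beta_n[x] \leadsto_{mon} \alpha[x] \rightharpoonup \beta[x]$
if and only if A acquires $\alpha[x] \rightharpoonup \beta[x]$ in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a monotonic
reasoning process from $\alpha_0[x] \rightharpoonup \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup \beta_n[x]$;

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[x] \rightharpoonup/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup/\Rightarrow \beta_n[x] \leadsto_{non} \alpha[x] \Rightarrow \beta[x]$
if and only if A acquires $\alpha[x] \Rightarrow \beta[x]$ in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by a
nonmonotonic reasoning process from $\alpha_0[x] \rightharpoonup/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup/\Rightarrow \beta_n[x]$.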

Finally, we have: 

• (Notation for Inductive Reasoning Process Ascription I) 

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a] \leadsto_{ind} \alpha[x] \rightharpoonup \beta[x]$ if
and only if A acquires in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by an inductive reasoning
process the strict belief that [$\alpha[x] \rightharpoonup \beta[x]$ is true] from having in at
least one parameter-setting from $s(t - \Delta t)$ to $s(t)$ either (i) the singular
belief that [$\alpha_i[a] \wedge \beta_i[a]$ is true] or (ii) the singular belief that [$\neg\alpha_i[a]$ is
true], while having in no such parameter-setting the singular belief that
[$\alpha_i[a] \wedge \neg\beta_i[a]$ is true], for every $i \in \{0, \ldots, n\}$.

• (Notation for Inductive Reasoning Process Ascription II) 

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a] \leadsto_{ind} \alpha[x] \Rightarrow \beta[x]$ if
and only if A acquires in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by an inductive reasoning
process the defeasible belief that [$\alpha[x] \Rightarrow \beta[x]$ is true] from having in most
parameter-settings from $s(t - \Delta t)$ to $s(t)$ either (i) the singular belief that
[$\alpha_i[a] \wedge \beta_i[a]$ is true] or (ii) the singular belief that [$\neg\alpha_i[a]$ is true], while
having only very rarely, if at all, the singular belief that [$\alpha_i[a] \wedge \neg\beta_i[a]$
is true], for every $i \in \{0, \ldots, n\}$.
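Or, abbreviated:

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a] \leadsto_{ind} \alpha[x] \rightharpoonup \beta[x]$
if and only if A acquires $\alpha[x] \rightharpoonup \beta[x]$ in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by an inductive
reasoning process from $\alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a]$;

$\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a] \leadsto_{ind} \alpha[x] \Rightarrow \beta[x]$
if and only if A acquires $\alpha[x] \Rightarrow \beta[x]$ in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ by an inductive
reasoning process from $\alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a]$.

For the reasons indicated above we will not give definitions for
$\leadsto_{mon}$, $\leadsto_{non}$, but we will rather consider them as basic undefined terms. However,
we can define $\leadsto_{ind}$ in terms of $\leadsto_{mon}$, $\leadsto_{non}$: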

Definition 43 (Inductive Reasoning Process Ascription) 

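1. $\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a] \leadsto_{ind} \alpha[x] \rightharpoonup \beta[x]$

iff

there are amounts $\Delta_0 t$, $\Delta_1 t$ of time, s.t. $t - \Delta t = t - \Delta_0 t - \Delta_1 t$ and

(a) (Learning)

for every $i \in \{0, \ldots, n\}$:

$\langle s(t - \Delta_0 t - \Delta_1 t), \ldots, s(t - \Delta_1 t) \rangle \models \alpha_i[a] \wedge \beta_i[a] \leadsto \alpha_i[x] \rightharpoonup \beta_i[x]$

(b) (Monotonic Reasoning)

$\langle s(t - \Delta_1 t), \ldots, s(t) \rangle \models \alpha_0[x] \rightharpoonup \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup \beta_n[x] \leadsto_{mon} \alpha[x] \rightharpoonup \beta[x]$

2. $\langle s(t - \Delta t), \ldots, s(t) \rangle \models \alpha_0[a] \wedge \beta_0[a], \ldots, \alpha_n[a] \wedge \beta_n[a] \leadsto_{ind} \alpha[x] \Rightarrow \beta[x]$

iff

there are amounts $\Delta_0 t$, $\Delta_1 t$ of time, s.t. $t - \Delta t = t - \Delta_0 t - \Delta_1 t$ and

(a) (Learning)

for every $i \in \{0, \ldots, n\}$:

$\langle s(t - \Delta_0 t - \Delta_1 t), \ldots, s(t - \Delta_1 t) \rangle \models \alpha_i[a] \wedge \beta_i[a] \leadsto \alpha_i[x] \rightharpoonup/\Rightarrow \beta_i[x]$

(b) (Nonmonotonic Reasoning)

$\langle s(t - \Delta_1 t), \ldots, s(t) \rangle \models \alpha_0[x] \rightharpoonup/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightharpoonup/\Rightarrow \beta_n[x] \leadsto_{non} \alpha[x] \Rightarrow \beta[x]$.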

We do not exclude the case where $n = 0$, $\alpha_0[a] = \alpha[a]$, and $\beta_0[a] = \beta[a]$:
in such a case the inductive reasoning process coincides with its subordinate 
learning process. For simplicity, we have assumed that first all learning pro- 
cesses take place if an inductive reasoning process takes place, and only after- 
wards the monotonic/nonmonotonic reasoning process. 
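
The two-stage structure of such a process (first all learning steps, then one reasoning step) can be sketched in Python for the strict case. This is merely an illustration under strong assumptions: perceived objects are represented as sets of predicate names, the functions `learn`, `monotonic_step`, and `inductive_process` are hypothetical, and the monotonic reasoning step is modelled as chaining of universal conditionals (transitivity), although monotonic reasoning processes are treated as undefined basic terms in the text.

```python
def learn(observations, alpha, beta):
    """Learning step for a strict conditional (alpha, beta).

    observations: list of sets of predicate names true of the perceived object.
    The conditional is adopted if some observation supports it
    (alpha-and-beta, or not-alpha) and no observation is an exception.
    """
    support = any((alpha in o and beta in o) or (alpha not in o)
                  for o in observations)
    exception = any(alpha in o and beta not in o for o in observations)
    return (alpha, beta) if support and not exception else None


def monotonic_step(conditionals, alpha, beta):
    """Reasoning step: derive (alpha, beta) by chaining strict conditionals.

    Chaining is an assumed stand-in for an arbitrary deductive process.
    """
    reachable, frontier = {alpha}, [alpha]
    while frontier:
        current = frontier.pop()
        for a, b in conditionals:
            if a == current and b not in reachable:
                reachable.add(b)
                frontier.append(b)
    return (alpha, beta) if beta in reachable else None


def inductive_process(observations, premises, conclusion):
    """Compose learning (first) and monotonic reasoning (afterwards)."""
    learned = [learn(observations, a, b) for a, b in premises]
    if any(c is None for c in learned):
        return None
    return monotonic_step(learned, conclusion[0], conclusion[1])


observations = [{"Penguin", "Bird", "Animal"}, {"Bird", "Animal"}, {"Stone"}]
chained = inductive_process(
    observations,
    [("Penguin", "Bird"), ("Bird", "Animal")],
    ("Penguin", "Animal"),
)
# Case n = 0: the inductive process coincides with its learning process.
zero_case = inductive_process(observations, [("Bird", "Animal")], ("Bird", "Animal"))
```

In the `zero_case` call the single learned premise is already the conclusion, so the reasoning step contributes nothing, which mirrors the “zero” process mentioned in the assumptions above.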

We forgo any definitions of learning processes, monotonic/nonmonotonic 
reasoning processes, and inductive reasoning processes as sets of sequences of 
parameter-settings.





Example 44 (The Bird Example Reconsidered) 

A has observed birds for some period of time. By a learning process,
she has formed the general belief that [$Bird[x] \Rightarrow_{nor} CanFly[x]$ is true] from
having the perceptual belief that [$Bird[a] \wedge CanFly[a]$ is true] at a great number
of time instants (notwithstanding exceptions). Furthermore, again by a learning
process, she has produced the general belief that [$Penguin[x] \Rightarrow_{nor} Bird[x] \wedge \neg CanFly[x]$
is true] from having the perceptual belief that [$Penguin[a] \wedge Bird[a] \wedge \neg CanFly[a]$
is true] at a great number of time instants. Now, a nonmonotonic
reasoning process takes place in A which leads from the general
belief that [$Bird[x] \Rightarrow_{nor} CanFly[x]$ is true] together with the general belief
that [$Penguin[x] \Rightarrow_{nor} Bird[x] \wedge \neg CanFly[x]$ is true] to the further general belief
that [$Penguin[x] \wedge Bird[x] \Rightarrow_{nor} \neg CanFly[x]$ is true]. This latter general
belief is caused directly by the two former general beliefs, but only indirectly by
the sequences of perceptual beliefs referred to above. Summing up, an inductive
reasoning process takes place in A from a sequence of perceptual beliefs to a
general belief.

Example 45 

1. (In terms of computers employing symbolic computation) 

Inductive reasoning processes may be implemented by symbolic computation
in the following way: (i) learning processes: from a set of singular empirical
data a set of general sentences is produced and is “written” into a
conditional knowledge base; (ii) monotonic/nonmonotonic reasoning pro- 
cesses: further general sentences are generated by applying inference rules 
to the general sentences generated in (i). In the case of a so-called expert 
system, there is no learning process going on within the system, but the 
primary set of general sentences is rather put into a conditional knowl- 
edge base by some human experts, s.t. these general sentences are the 
outcomes of inductive reasoning processes that have been going on in the 
human experts. However, monotonic/nonmonotonic reasoning processes 
are indeed carried out by the expert system by applying inference rules to 
the general sentences in the conditional knowledge base 

2. (In terms of connectionist networks employing state-transitions) 

An inductive reasoning process may be implemented within artificial neural
networks in the following way: from a set of singular empirical data a set
of input-output pairs is generated where each component of such a pair is
a set of nodes; a learning algorithm is initiated by which the weights of
those connections are increased that (directly or indirectly) connect some
nodes of the input component of a pair to some of the nodes of the output
component of the same pair. This inductive reasoning process combines
a learning process with an implicitly contained monotonic/nonmonotonic
reasoning process, since also connections are “strengthened” that do not
immediately connect nodes within the originally computed input-output
pairs (and general beliefs correspond to the states of the totality of network
connections).

In chapter 8 we will define what it means to say that A draws
a justified inference. In the case of acquired inferences, our definition of the 
notion of justified inference will remain incomplete, since we do not have a 
corresponding definition for justified learning processes, and thus not for jus- 
tified inductive reasoning processes. However, a definition of what it means 
to say that a justified second-order monotonic (but first-order monotonic or 
nonmonotonic) reasoning process takes place in A, i.e., a justified reasoning 
process that leads from a set of general beliefs to a further general belief, is
implicitly contained in our definitions 52 and 54.

In part IV we will show in detail how justified basic inferences may be 
produced either by symbolic computation or by state-transitions in networks; 
moreover, we are going to sketch how second-order monotonic, justified, and 
first-order monotonic or nonmonotonic reasoning processes may be produced 
by symbolic computation (in networks such processes will be seen to be
superfluous). We would be able to show how the members of a large class of (fully)
justified acquired inferences might be produced either by symbolic computation 
or by state-transitions in networks, if we were given that the learning processes 
involved are justified. However, we have no theory to offer of how to generate 
justified learning processes in a cognitive architecture. 

The two open problems of how to define and to generate justified learn- 
ing processes constitute, roughly, the classical problem of induction; we have 
only stated the problem in cognitive terms and not in the purely syntactical or 
logical terminology that is usually applied in the philosophy of science. 

6.3.3 The Ascription of Justified Inference 

We have not yet clarified in which way we should ascribe justified inferences to 
agents. Similar to the case of justified-belief ascription, we can speak of an inference
of A as being justified only relative to (i) a trajectory $\langle s(t - \Delta t), \ldots, s(t) \rangle$
of parameter-settings of A in which A draws the inference, and to (ii) a trajectory
$(s(t))_{t \in In}$ of A ($In$ is again some index set), s.t. $s(t - \Delta t), \ldots, s(t)$ is the
subsequence of parameter-settings of A from $t - \Delta t$ up to $t$ in the trajectory
$(s(t))_{t \in In}$, and where $(s(t))_{t \in In}$ is the cognitive history of A. This relativization
is necessary for the following reason: according to our explication in part
I, inferences are mental processes, i.e., sets of trajectories of parameter-settings 
of our cognitive agent A. But we have just seen that the justifiedness of such 
processes depends on whether the processes are basic or acquired, and - in 
the latter case - by which further processes they have been acquired. Since 
an inference process is only acquired at some time t', i.e., since the general 
belief(s) that the inference drawn in {s{t — At ), . . . , s{t)) is based on are ac- 
quired at the time t' before t — At, s.t. not all of the latter beliefs have been 
held by A before t', but they are all held by A after t', also the justifiedness
of an inference is relative to the cognitive history of A; this is due to the fact
that an inference drawn in $\langle s(t - \Delta t), \ldots, s(t) \rangle$ is only acquired relative to a
trajectory $(s(t))_{t \in In}$. An inference drawn from time $t - \Delta t$ to $t$ is defined
to be basic relative to a trajectory $(s(t))_{t \in In}$ if it is not acquired relative to
the trajectory. An inference from $\alpha$ to $\beta$ might be drawn justifiedly from time
$t - \Delta t$ to $t$ relative to a trajectory $(s(t))_{t \in In}$, but not relative to another
trajectory $(s^*(t))_{t \in In^*}$ (for $\langle s(t - \Delta t), \ldots, s(t) \rangle$ being identical to the subsequence of
$(s^*(t))_{t \in In^*}$ from $t - \Delta t$ to $t$). E.g., the inference could be unreliably acquired
in $(s^*(t))_{t \in In^*}$, but basic in $(s(t))_{t \in In}$; or the inference could be acquired in
$(s(t))_{t \in In}$ by a second-order reliable process, but not in $(s^*(t))_{t \in In^*}$.

Moreover, similar to the case of justified beliefs, we opt for ascribing 
justified inferences to A by means of a justified-inference operator (we will use 
an operator J), rather than by a justifiedness predicate that is applied to names 
of inferences. Wherever we call A’s inference process from $\alpha$ to $\beta$ drawn from
time $t - \Delta t$ to $t$ justified relative to the cognitive history $(s(t))_{t \in In}$, this is
understood as being synonymous to: A justifiedly infers $\beta$ from $\alpha$ from time
$t - \Delta t$ to $t$ relative to the cognitive history $(s(t))_{t \in In}$; or: it is justified relative
to the cognitive history $(s(t))_{t \in In}$ that A infers from $\alpha$ to $\beta$ from time $t - \Delta t$
to $t$.




Chapter 7 

A DISCUSSION OF RELIABILITY 



7.1 First-Order Reliability: Intuitions 

An inference process has been said to be (first-order) conditionally reliable if 
and only if it tends to produce more true conclusion beliefs than false conclusion 
beliefs given that its premise beliefs are true; in the following we will usually 
drop the qualification ‘conditionally’ when we refer to conditional reliability, 
since we are going to focus only on conditional reliability anyway. Thus, we say
that it is reliable for A to infer from $\alpha$ to $\beta$ if and only if this inference tends to
produce more true beliefs in the truth of $\beta$ than false beliefs in the truth of $\beta$,
given that the belief in the truth of $\alpha$ is true.

Recall from chapter 2 that sentences like $\alpha$ and $\beta$ contain the individual
constant $a$, which denotes at a time $t$ the object standing before A.
Thus, the truth values of $\alpha$ and $\beta$ may vary both with the time $t$, and with
the place that A is situated in at $t$, i.e., with the object $O(t)$ (referred to in
chapter 2) that is in front of A at time $t$. Consequently, the actual reliability
of an inference depends on the given (actual) sequence $O$. Put formally: let
$\alpha, \beta \in \mathcal{L}$; let

$$tr_n^O(\beta \mid \alpha) = \frac{card\{0 \le t \le n \mid \langle \mathfrak{I}_{act}, O(t) \rangle \models \alpha \wedge \beta\}}{card\{0 \le t \le n \mid \langle \mathfrak{I}_{act}, O(t) \rangle \models \alpha\}}$$

if there is a $t$, s.t. $0 \le t \le n$ and $\langle \mathfrak{I}_{act}, O(t) \rangle \models \alpha$; let
$tr_n^O(\beta \mid \alpha) = 1$ otherwise. $tr_n^O(\beta \mid \alpha)$ is the truth ratio up to $n$ for $\beta$ given $\alpha$
relative to $O$, i.e., the proportion of instants of time in which $\beta$ is true in the
sequence $O$ up to time $n$, given that $\alpha$ is true. If the limit $\lim_{n \to \infty} tr_n^O(\beta \mid \alpha)$ exists,
we call it the truth ratio of the inference from $\alpha$ to $\beta$ relative to $O$; the truth
ratio of an inference relative to $O$ is nothing but a limit of relative frequencies,
and it is the degree of reliability (or: “the” reliability) of the inference relative
to $O$. If the truth ratio is “high” relative to $O$, where $O$ is the sequence of objects
that A actually “meets”, then the inference from $\alpha$ to $\beta$ might be called
reliable.
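
For a finite initial segment of the sequence, the truth ratio can be computed directly. The following Python sketch is an illustration only: objects are represented as sets of predicate names (an assumption made here for simplicity), and the function name `truth_ratio` and the sample sequence are hypothetical.

```python
def truth_ratio(objects, alpha, beta, n):
    """Truth ratio up to n for beta given alpha relative to a sequence of objects.

    objects: list whose entry at index t is the set of predicate names true of
    the object encountered at time t. Returns 1.0 if alpha is true of no
    object up to n, following the convention stated in the text.
    """
    window = objects[: n + 1]
    alpha_count = sum(1 for o in window if alpha in o)
    if alpha_count == 0:
        return 1.0
    both_count = sum(1 for o in window if alpha in o and beta in o)
    return both_count / alpha_count


sequence = [
    {"Bird", "CanFly"},
    {"Bird", "CanFly"},
    {"Bird", "Penguin"},
    {"Stone"},
    {"Bird", "CanFly"},
]
ratio = truth_ratio(sequence, "Bird", "CanFly", 4)  # three of the four birds fly
```

The limit for n going to infinity is of course not computable from a finite segment; the sketch only covers the finite ratios whose limit is at issue.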

We have thus introduced a metalinguistic reliability operator ‘it is re- 
liable that’ or ‘it is reliable to’ that expresses reliability relative to the given 
sequence $O$ of objects, where $O(t)$ is the object encountered by A at $t$. But
$O$ might be an improbable sequence of objects, or an abnormal sequence of
objects, presupposing some given notions of probability and normality. In such
a case, it might be reliable to infer from $\alpha$ to $\beta$, although $\lim_{n \to \infty} tr_n^O(\beta \mid \alpha)$ were
not high. E.g., although it is reliable for A to draw the basic nonmonotonic
inference from $Bird[a]$ to $CanFly[a]$, since normal birds can fly in the territory
which A is situated in, bad luck might have it that every bird in the sequence $O$
is actually a penguin. If reliability were defined by the truth ratio from above,
then this inference would be first-order unreliable and thus unjustified, which is
counter-intuitive. Furthermore, the limit $\lim_{n \to \infty} tr_n^O(\beta \mid \alpha)$ might not even exist,
and whether A justifiedly infers from $\alpha$ to $\beta$ at $t$ would depend on the entries
of the sequence $O$ after $t$, i.e., on future prospects. Thus we should rather aim
at defining the first-order reliability of an inference differently. However, the
definiens that we are looking for should be such that if the inference from $\alpha$ to
$\beta$ is reliable, and if $O$ were the actual sequence of objects encountered by A, it
would be improbable that $tr_n^O(\beta \mid \alpha)$ was not “high” in the long run. Otherwise,
a reliable inference would not really be truth-conducive.

This can be achieved in two ways: by using a quantitative, i.e., a nu- 
merical notion of reliability, or by using a qualitative, i.e., a non-numerical 
notion of reliability - the difference will be explained in detail below. Each of
the two kinds of reliability may be demanded to have a maximum degree of 
strength, or just a high degree of strength: in the first case we speak of absolute 
reliability, in the second case we speak of high reliability. Absolute reliability 
entails high reliability, but not necessarily vice versa. 

In order to assess the possible versions of reliability that we are going to 
deal with, we need some criteria of quality that we can use in order to abandon 
inadequate notions of reliability. 



7.2 Quality Criteria for Notions of First-Order Reliability 

We postulate the following criteria: 

• (Quality Criteria for Notions of First-Order Reliability) 

If a notion of first-order reliability for inferences is adequate then 

— (Correctness) 

the set of reliable inferences is not “too large”, or equivalently, the 
notion of reliability is not too weak: in particular, it should be im- 
probable that the truth ratio of a reliable inference process relative 
to O is smaller than 0.5, and thus, the risk of failure that is associ- 
ated with every concrete application of the inference should not be 
too high 

— (Power) 

the set of reliable inferences is not “too small” , or equivalently, the 
notion of reliability is not too strong: the set of reliable inferences 




Quantitative Notions of First-Order Reliability 



123 



should not be selected too “cautiously” , it should not be constrained 
to trivial cases only 

- (Feasibility, Tractability) 

the set of reliable inferences only contains inferences that may be 
generated within a practically sensible amount of time. This may 
be put more precisely in a great variety of ways; in a rather de- 
manding version, we could postulate that the time complexity of 
such reliable inferential processes should - in the ideal case - be of 
polynomial order relative to some appropriate complexity measure 
that is applicable to premise beliefs and conclusion beliefs 

— (Simplicity) 

the set of reliable inferences only contains inferences that may be 
generated also by low-level agents, i.e., agents of a low cognitive 
complexity. More specifically, the set of reliable inferences should be 
of such a kind that it contains large subsets of reliable inferences 
that may be generated collectively (but of course not necessarily 
simultaneously) by a single low-level agent. 

Correctness defines a lower reliability boundary for high reliability in- 
ferences. Power ensures the usefulness of a notion of reliability. Feasibility (or 
tractability) secures its practical applicability. Simplicity guarantees that our 
intended notion of reliability leads to a notion of justifiedness for inferences 
which satisfies the low-level postulate from section 6.2. Feasibility and Sim- 
plicity, of course, cohere; Power interferes with both of them. The term ‘large’ 
that we have used for the statement of Simplicity is vague but, as we will see, 
sufficiently specified in order to make Simplicity informative; ‘large’ is not only 
meant to express quantity, but it should also be understood in the way that a 
relevant, interesting, and representative set of reliable inferences ought to be 
producible by a low-level agent.



7.3 Quantitative Notions of First-Order Reliability 

Let us first deal with quantitative notions of first-order reliability. A quantita- 
tive account of reliability states truth-conditions for ‘it is reliable that . . .’ in 
a way that makes use of numerical values; the typical case is a probabilistic 
definition of reliability of the following form: it is absolutely reliable to infer
from $\alpha[a]$ to $\beta[a]$ iff the conditional probability $Prob_{act}(\beta[x] \mid \alpha[x])$ of $\beta[x]$ given
$\alpha[x]$ is 1. It is highly reliable to infer from $\alpha[a]$ to $\beta[a]$ iff the conditional probability
$Prob_{act}(\beta[x] \mid \alpha[x])$ of $\beta[x]$ given $\alpha[x]$ is larger than $1 - \epsilon$, where $\epsilon$ is some
fixed small real number, s.t. $\epsilon$ is smaller than 0.5. The vague qualification of a
probability as being high is eliminated in the latter case in favour of a precise 
and numerically specific account. 
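
The two quantitative definitions can be made concrete for a finite domain. The Python sketch below is an illustration under assumptions: the domain is given as a list of hypothetical (properties, weight) pairs whose weights sum to 1, the function names are invented for this purpose, the default value of `eps` is arbitrary, and an undefined conditional probability (premise of probability 0) is simply treated as yielding no reliability, a case the text does not address.

```python
from fractions import Fraction


def conditional_probability(domain, alpha, beta):
    """Probability of beta given alpha over a finite weighted domain.

    domain: list of (set_of_predicate_names, weight) pairs.
    Returns None if alpha has probability 0 (conditional probability undefined).
    """
    p_alpha = sum((w for props, w in domain if alpha in props), Fraction(0))
    if p_alpha == 0:
        return None
    p_both = sum((w for props, w in domain if alpha in props and beta in props),
                 Fraction(0))
    return p_both / p_alpha


def absolutely_reliable(domain, alpha, beta):
    """Absolute reliability: the conditional probability is 1."""
    cp = conditional_probability(domain, alpha, beta)
    return cp is not None and cp == 1


def highly_reliable(domain, alpha, beta, eps=Fraction(1, 20)):
    """High reliability: the conditional probability exceeds 1 - eps, eps < 0.5."""
    if not (0 < eps < Fraction(1, 2)):
        raise ValueError("eps must lie strictly between 0 and 0.5")
    cp = conditional_probability(domain, alpha, beta)
    return cp is not None and cp > 1 - eps


# Hypothetical sandbox territory: mostly flying birds, a few penguins and stones.
territory = [
    ({"Bird", "CanFly"}, Fraction(95, 100)),
    ({"Bird", "Penguin"}, Fraction(2, 100)),
    ({"Stone"}, Fraction(3, 100)),
]
```

With these weights the inference from Bird to CanFly has conditional probability 95/97, so it counts as highly but not absolutely reliable, whereas the inference from Penguin to Bird counts as absolutely reliable.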
$Prob_{act}$ is the probability measure which is associated with our epistemological
sandbox territory, s.t. $Prob_{act}(\varphi[x])$ is the probability that A encounters
an object $d$ with $\langle \mathfrak{I}_{act}, d \rangle \models \varphi[a]$. We regard $Prob_{act}$ as an objective probability
measure that is applied to open formulas $\varphi[x]$ where $\varphi[a] \in \mathcal{L}$; semantically,
$Prob_{act}$ is defined on a $\sigma$-algebra of subsets of the domain $D_{act}$ of objects.
Since $D_{act}$ has been assumed finite in chapter 2, we can simply identify this
$\sigma$-algebra with the powerset algebra on $D_{act}$ right from the start (see section
9.2), without taking for granted any problematic probabilistic presumptions.
$Prob_{act}$ is not a subjective or epistemic probability measure, s.t. $Prob_{act}(\varphi[a])$
would be the rational degree of belief associated with the belief that [$\varphi[a]$ is
true] for some time $t$, because otherwise the postulate Correctness from above
would not be guaranteed to be satisfied. Furthermore, $Prob_{act}(\varphi[x])$ is not necessarily
assumed to be identical to $\lim_{n \to \infty} tr_n^O(\varphi[a] \mid \top)$ from above, where $\top$ is
the logical verum; rather, we regard ‘$Prob_{act}$’ as a theoretical term that is not
defined at all but implicitly characterized by a set of axioms (including the Kolmogorov
axioms of probability, of course). Sentences involving ‘$Prob_{act}$’ ascribe
propensities, i.e., probabilistic dispositions, to classes of entities.

We use a conditional probability in order to match the conditionality of 
the notion of reliability that we are after. E.g., if we defined absolute reliability 
in the way that it would be absolutely reliable to draw an inference from $\alpha[a]$
to $\beta[a]$ iff the following holds: if $Prob_{act}(\alpha[x]) = 1$, then also $Prob_{act}(\beta[x]) = 1$, this latter notion
of reliability would be much too weak: all inferences from formulas with a 
probability smaller than 1 to any formula - including formulas with very low 
probability - would turn out to be absolutely reliable, and thus Correctness 
would be violated. 
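
That the rejected definition is too weak can be checked on a small finite example. The Python sketch below is again only an illustration: the weighted domain and the function names `probability` and `weakly_absolutely_reliable` are hypothetical.

```python
from fractions import Fraction


def probability(domain, phi):
    """Unconditional probability of phi over a finite weighted domain."""
    return sum((w for props, w in domain if phi in props), Fraction(0))


def weakly_absolutely_reliable(domain, alpha, beta):
    """The rejected definition: if Prob(alpha) = 1, then Prob(beta) = 1."""
    return probability(domain, alpha) != 1 or probability(domain, beta) == 1


territory = [
    ({"Bird", "CanFly"}, Fraction(95, 100)),
    ({"Bird", "Penguin"}, Fraction(2, 100)),
    ({"Stone"}, Fraction(3, 100)),
]

# Bird has probability 97/100 < 1, so the conditional clause holds vacuously.
vacuous = weakly_absolutely_reliable(territory, "Bird", "Penguin")
# The conditional probability of Penguin given Bird is nevertheless very low.
penguin_given_bird = (Fraction(2, 100)) / probability(territory, "Bird")
```

Here the inference from Bird to Penguin comes out as ‘absolutely reliable’ under the weak definition although only 2 out of 97 parts of the bird probability mass fall on penguins, which is exactly the kind of violation of Correctness described above.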

If reliability is defined quantitatively as above, exceptions to the universal
conditional $\alpha[x] \rightarrow \beta[x]$ are tolerated, s.t. the truth of the conclusion
of a reliable inference is guaranteed in all cases in which the premise is true
- with the possible exception of a negligible number of cases. In the case of
absolute quantitative reliability, the set of exceptions is demanded to be a
zero set, i.e., a set with probability measure 0. In such a case, the inference
from $\alpha[a]$ to $\beta[a]$ is certain despite the presence of possible exceptions, and this
suffices to call such a kind of reliability ‘absolute’. High conditional probability
should not be mixed up with certainty in the latter case, since if the conditional
probability $Prob_{act}(\beta[x] \mid \alpha[x])$ is high, the set of exceptions to $\alpha[x] \rightarrow \beta[x]$ may
have non-zero but numerically small probability.



7.4 Qualitative Notions of First-Order Reliability 

A qualitative account of reliability states truth-conditions for ‘it is reliable 
that . . .’ in a way that avoids the use of precise numerical values; the typical 
cases are definitions of absolute reliability by universal general conditionals, 
and of high reliability by either qualitative high probability conditionals or 
normic conditionals. For absolute reliability, we define: it is absolutely reliable
to infer from $\alpha[a]$ to $\beta[a]$ iff $\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$ (recall the hidden universal
quantifier in $\alpha[x] \rightarrow \beta[x]$). In the case of high reliability we have: (probabilistic
version) it is highly reliable to infer from $\alpha[a]$ to $\beta[a]$ iff $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$,
i.e., the conditional probability $Prob_{act}(\beta[x] \mid \alpha[x])$ of $\beta[x]$ given $\alpha[x]$ is high,
or, synonymously, (by far the) most $\alpha$s are $\beta$s. (Normic version) It is highly
reliable to infer from $\alpha[a]$ to $\beta[a]$ iff $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$, i.e., normal $\alpha$s are
$\beta$s.

Strictly speaking, three models are involved here, which satisfy universal conditionals,
high probability conditionals, and normic conditionals, respectively; each of
the models represents an aspect of the territory that A is a part of. We will
focus on such kinds of models in detail in part III. From now on let us plainly
write ‘$\mathfrak{M}_{act}$’ for all of them and consider $\mathfrak{M}_{act}$ in the
way that it incorporates components in virtue of which universal conditionals,
high probability conditionals, and normic conditionals are satisfied, such that
each such component is given in correspondence to our epistemological sandbox
area.

In none of the definitions sketched above is any reference made to
numerical values. In particular, the vague characterization of a probability as
being “high” is preserved, and we will thus have to clarify the semantics of
$\alpha[x] \Rightarrow_{hp} \beta[x]$ in more detail. In any case, we shall not define that $\mathfrak{M}_{act} \models
\alpha[x] \Rightarrow_{hp} \beta[x]$ in the way that $Prob_{act}(\beta[x] \mid \alpha[x])$ is larger than $1 - \epsilon$ for some
small real number $\epsilon$, since this would not do justice to the vague term ‘high’
according to a qualitative approach.

Let us now deal more closely with the kind of modal, or counterfactual,
notion of normality that we use here. For the same reasons as in the probabilistic
case, a normality operator is applied to open formulas, and it is interpreted
objectively. This is in line with Schurz[154], section 1, where an overview of
various normality notions may be found. A subjective notion of normality was
used by Goldman [69] in one version of his reliabilist theory of justification (see the
second chapter in the appendix, in particular, section 22.5.3), but he had to
abandon his account for precisely the subjectivity of this notion. In psychology, 
subjective normality has been dealt with in the context of the ‘prototype theory 
of concepts’ (e.g. by Rosch[136]): e.g., our mental bird concept is supposed to 
be identical to a set of “prototypical” bird representations, s.t. each of these 
prototypes is represented as being able to fly; concepts are acquired by acquiring 
prototype representations. In the field of Artificial Intelligence normality is 
often interpreted, e.g. by McCarthy [104], [105], in terms of communication 
conventions, s.t., ‘normal birds can fly’ means: if I tell you that something is a 
bird, then I always want you to understand this in the way that I actually refer 
to something that can fly; I will tell you otherwise. Similar conventions are to 
be used when, e.g., a database is created. Each of these subjective versions of 
normality is unacceptable as an explication of the kind of objective reliability 
according to which the cat’s inference in the bird example is reliable, s.t. her
inference is at least primarily justified. Goldman’s failure to incorporate a
subjective notion of reliability into an externalist theory of justification is going
to be avoided by our using an objective version of reliability.

We have reserved the term ‘normality’ strictly for a modal notion of 
normality, and we do not speak of normality with a probabilistic connotation:
where Schurz[154], and also Reiter [132] and Pearl[118], refer to ‘statistical
normality’, we rather say ‘high probability’; Schurz’s (and the others’)
term ‘prototypical normality’ corresponds to our ‘normality’ simpliciter. We 
want to avoid the term ‘prototypical normality’ since ‘prototypical’ is itself 
ambiguous, as we have already seen above when we cited Rosch’s theory. Pro- 
totypes are often referred to in the context of conceptual space representations 
(see Gärdenfors[59]), i.e., they have a certain geometrical interpretation, and
this interpretation differs from our conception of normality. Prototypical $\alpha$s
are thought of as those $\alpha$s which are “central” to the “region” of $\alpha$s in a
conceptual space and which somehow represent the whole region in precisely that
way. The prototypicality of an object with respect to being an $\alpha$ is measured by
the “distance” of the object from the $\alpha$-prototypes. On the other hand, normal
$\alpha$s, as we understand it, and as it is usually understood, are those $\alpha$s which
are “most normal” according to a general normality ordering given independently
of $\alpha$;
put geometrically, the absolute normality of an object is measured by the “dis- 
tance” between the object and the absolute zero vector of maximum normality. 
This is the driving intuition behind the usual semantics for normic sentences, 
which validates precisely those rules of inference for normic sentences that also 
seem to be intuitively valid (we will deal with this semantics in part III). It is 
easy to see that, under the given geometrical interpretations, conditionals of 
the form ‘prototypical $\alpha$s are $\beta$s’ conform to a different logic than conditionals
of the form ‘normal $\alpha$s are $\beta$s’. Indeed, conditionals of the former kind do not
seem to conform to any interesting recursively enumerable set of logical rules
at all*, whereas the normic conditionals of the second kind indeed do, as we will
see in part III. Therefore, we shall understand normality not in the prototypical
sense that we have just described but in the sense of normic conditionals like
‘normal $\alpha$s are $\beta$s’; we also do not speak of ‘prototypical normality’ but just
of ‘normality’ simpliciter. Moreover, the geometrical interpretation of normal
$\alpha$s as those objects that are “closest” to a point of maximum normality is
perhaps too restrictive: objects - in our case, the objects in $D_{act}$ - are
intuitively ordered according to their normality, s.t. one object may be more
normal than another: e.g., a bird that is able to fly may be more normal than
some bird that is not able to fly. But this kind of normality order is not
necessarily meant to
be isomorphic to a numerical order according to degrees of normality, like the 
distances from a zero vector in a norm space. Rather,
$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$ if and only if the most normal
$\alpha$s (according to the order $\prec_{act}$ defined by $\mathfrak{M}_{act}$) are $\beta$s,
where an object is a most normal $\alpha$-object if it satisfies $\alpha$ and there is
no further object that also satisfies $\alpha$ while being more normal; the underlying
order relation of one object being more normal than another object may be
subject to different constraints, but not necessarily subject to the constraint of
being numerical.

*This has been pointed out to me by W. Rabinowicz (personal communication).
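
The truth condition just given can be tried out on a finite toy domain. The following is only a minimal sketch under assumptions that are not part of the text: the domain, the predicates, and the encoding of the normality order as a set of pairs (`more_normal`) are hypothetical illustrations, and the order is not required to be numerical.

```python
from typing import Callable, Hashable, Iterable, List, Set, Tuple

Obj = Hashable
Pred = Callable[[Obj], bool]


def most_normal(domain: Iterable[Obj],
                more_normal: Set[Tuple[Obj, Obj]],
                alpha: Pred) -> List[Obj]:
    """Return the alpha-objects for which no other alpha-object is more normal.

    A pair (x, y) in more_normal is read as "x is more normal than y".
    """
    candidates = [d for d in domain if alpha(d)]
    return [d for d in candidates
            if not any((other, d) in more_normal for other in candidates)]


def normic_holds(domain: Iterable[Obj],
                 more_normal: Set[Tuple[Obj, Obj]],
                 alpha: Pred,
                 beta: Pred) -> bool:
    """True iff every most normal alpha-object satisfies beta.

    The condition holds vacuously if nothing satisfies alpha.
    """
    return all(beta(d) for d in most_normal(domain, more_normal, alpha))


# Hypothetical toy domain: every object here is a bird.
domain = ["sparrow", "robin", "penguin"]
more_normal = {("sparrow", "penguin"), ("robin", "penguin")}


def is_bird(d: Obj) -> bool:
    return True


def is_penguin(d: Obj) -> bool:
    return d == "penguin"


def can_fly(d: Obj) -> bool:
    return d != "penguin"


print(normic_holds(domain, more_normal, is_bird, can_fly))     # True
print(normic_holds(domain, more_normal, is_penguin, can_fly))  # False
```

The second call shows why the notion is conditional: the most normal objects are selected among the objects satisfying the antecedent, so strengthening the antecedent can change which objects count.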

The so-defined normic notion of reliability is again a conditional one,
since we refer to the most normal objects among those objects that satisfy
$\alpha$. As Kraus et al.[85], p.173, point out, the binary operator $\Rightarrow_{nor}$
cannot be replaced by a combination of a unary normality operator $N$ and the
material implication sign, s.t. $\alpha[x] \Rightarrow_{nor} \beta[x]$ would be analyzed as
$N(\alpha[x] \rightarrow \beta[x])$ or $\alpha[x] \rightarrow N\beta[x]$.^



7.5 Comparison of Quantitative and Qualitative Reliability w.r.t. 
Strength 

Before we evaluate these quantitative and qualitative reliability concepts ac- 
cording to the quality criteria from our list above, let us first compare the 
different notions concerning their logical relationships: obviously (presupposing
that the model $\mathfrak{M}$ that involves a probability measure $Prob$ and a normality
order as its components is construed realistically, i.e., matches our
epistemological sandbox territory),

• absolute qualitative reliability entails absolute quantitative reliability, i.e.,

if $\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$ then $Prob_{act}(\beta[x] \mid \alpha[x]) = 1$

• absolute quantitative reliability entails high qualitative reliability in the
probabilistic sense, i.e.,

if $Prob_{act}(\beta[x] \mid \alpha[x]) = 1$ then $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$.

At least for “most”, if not for all $\alpha[x]$, $\beta[x]$, we should have:

• high qualitative reliability in the probabilistic sense entails high
quantitative reliability for $\epsilon = 0.5$, i.e.,

if $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$ then $Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon = 0.5$.

^However, as KLM[85] seem to have overlooked, $\alpha[x] \Rightarrow_{nor} \beta[x]$ might be
analyzed as $N\alpha[x] \rightarrow \beta[x]$: such a unary normality or minimality operator
$N$ has been suggested by Van Benthem[172], pp.48f. But the binary $\Rightarrow_{nor}$
definitely has a more “transparent” interpretation.

We have not yet said anything about the logical strength of the
qualitative notion of high reliability in the normic sense. It is clear that we
have at least:

• absolute qualitative reliability entails high qualitative reliability in the
normic sense, i.e.,

if $\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$ then $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$.

Furthermore, we adopt Schurz’s[154] 

• (Statistical Consequence Thesis)

high qualitative reliability in the normic sense normally entails high
quantitative reliability in the probabilistic sense, i.e.,

if $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$ then normally also $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$.^

Thus, we also have for “most”, if not for all $\alpha[x]$, $\beta[x]$:

• high qualitative reliability in the normic sense normally entails high
quantitative reliability for $\epsilon = 0.5$, i.e.,

if $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$ then normally $Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon = 0.5$.

Schurz[154] argues for the statistical consequence thesis by an elabo- 
rate “evolution theory of normic laws”: according to this theory, normic laws 
are the phenomenological laws of (biological or cultural) systems which have 
evolved in a process of evolution by (natural or cultural) selection. Evolutionary 
systems are open and self-regulatory, contrary to the systems which have been 
traditionally studied by physics. The former possess normal states - so-called 
“Soll-Werte” - which have been selected evolutionarily, i.e., the normal states 
of such systems are either those states that contribute to the system’s evolu- 
tionary fitness with a high degree of probability, or states which are substates 
of such states, or states which are caused by such states. E.g., the birds’ ability 
to fly brings in most cases a decisive evolutionary benefit for birds; therefore,
normal birds can fly.^ Systems that are not within their normal states in the
statistical majority of cases are designated to become extinct - maybe in a long- 
term development - just because they are within states that are indifferent or 
even negative with respect to their selective properties. As Schurz[154] points 
out, evolution theory explains in this way not only why the phenomenological 
behaviour of evolutionary systems obeys normic laws, but it also explains why 
this peculiar connection between - in Schurz’s terms - prototypical and statis- 
tical normality exists at all. Birds, for instance, can normally fly. It is certainly 
possible that all birds suddenly lose their ability to fly (perhaps due to an
environmental catastrophe). But in such a case, the species of birds will become
extinct with a high probability after a period of evolution. As the possibility of
such catastrophes indicates, the implication relation between prototypical and
statistical normality is itself not strict but normic.

^See below for Schurz’s remarks on this kind of “meta-normality”.

^Actually, Schurz’s theory is much more complex: e.g., each claim concerning prototypical
normality has to be constrained to a specific species at a specific period of time. We just
omit such details for the sake of simplicity.

If Schurz is right - and he presents good arguments for his theory - 
then the statistical consequence thesis is true for all normic laws
$\alpha[x] \Rightarrow_{nor} \beta[x]$
which are “the phenomenological laws of evolutionary systems” (Schurz[154],
p.9). However, the class of such normic laws does perhaps not comprise the 
class of all normic sentences that are used to express typical general common 
sense beliefs like, e.g., ‘students normally attend demonstrations’, ‘students 
normally dislike early lectures’, and so on. Each of the latter normic sentences 
is intuitively true, just as their high probability counterparts ‘(by far the) most 
students attend demonstrations’, and ‘(by far the) most students dislike early 
lectures’ are true, but the properties expressed in the consequents of the latter 
normic and high probability claims are not contributing to the “evolutionary 
fitness” of students in the socio-cultural system that they are parts of, or so 
it seems. However, according to Schurz, the set of normal states also includes 
those states which do not themselves contribute to the system’s evolutionary 
fitness with a high degree of probability, but which are substates of such
states, or states which are caused by such states. E.g., students are normally 
poor: indeed, the notorious poverty of students does not contribute positively to 
their being students; but it might be justifiedly regarded as a part of a positively 
contributing state. Perhaps such an analysis is also possible in the case of the 
two examples above. We may call (after Schurz) statistical most-generalizations
that are neither normic laws in the evolutionary sense above, nor fundamental 
laws of physics, ‘normic generalizations’. According to Schurz’s analysis, cer- 
tain common sense examples of normic ‘laws’ might not really be lawlike, but 
they might rather be mere normic generalizations. However, since such
generalizations are simply synonymous with “statistical most-generalizations”, they still
satisfy the statistical consequence thesis - in this case even strictly. For now we 
simply presuppose Schurz’s statistical consequence thesis as a plausible hypoth- 
esis for normic conditionals. The methodological impact of the thesis is that 
normic laws have empirical content, since they are gradually disconfirmable in 
the same way as the usual precise statistical laws. As far as a reliabilist the- 
ory of justified inference, and in particular of normic inference, is concerned, 
the thesis is important because without it we could not define high qualitative 
reliability in terms of normality, since our Correctness postulate could not be 
shown to be satisfied. As Schurz[154], p.7, expresses this: “only if the thesis
holds, can reasoning from normic laws be practically reliable”.



7.6 Quantitative Reliability Fails Our “Low-Level” Quality Criteria

Now let us assess the different notions of reliability with respect to the quality 
criteria from above. First of all, each of the reliability concepts that we have 
presented conforms to Correctness: if it is absolutely or highly reliable to infer
from $\alpha[a]$ to $\beta[a]$ - independently of what notion of reliability from the above
is employed - then $Prob_{act}(\beta[x] \mid \alpha[x]) > 0.5$; in the case of high qualitative
reliability in the normic sense, this is due to the statistical consequence thesis.
Thus, by the law of large numbers, Correctness is satisfied, since it is probable
that reliable inferences lead to high truth ratios in the long run. Any of the 
reliability concepts that we have suggested is therefore also apt for a reliabilist 
definition of primarily justified inference, s.t. first-order truth-conduciveness as 
being given by a high truth-ratio is statistically entailed. 
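
The appeal to the law of large numbers can be illustrated numerically. The sketch below rests on a simplifying assumption that the text does not make explicit: the applications of an inference are treated as independent trials, each yielding a true conclusion with the same probability `p`. The function name and the sample values are hypothetical.

```python
from math import comb


def prob_truth_ratio_above_half(p: float, n: int) -> float:
    """Probability that more than half of n independent inferences are true.

    Each inference is assumed to yield a true conclusion with probability p,
    so the number of true conclusions is binomially distributed.
    """
    q = 1.0 - p
    return sum(comb(n, k) * p**k * q**(n - k)
               for k in range(n // 2 + 1, n + 1))


# With p only slightly favourable, a majority of true conclusions becomes
# more probable as the number of applications grows.
for n in (10, 50, 200):
    print(n, prob_truth_ratio_above_half(0.6, n))
```

Under this independence assumption, any `p` above 0.5 pushes the probability of a truth ratio above one half towards 1 as `n` increases, which is the sense in which a high truth ratio is statistically entailed.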

Moreover, also Power is satisfied, since each of the reliability concepts 
that we have dealt with leads to a large and non-trivial class of first-order reli- 
able inferences that includes: it is absolutely reliable to infer from Penguin(a) 
to Bird(a), since (first version) $Prob_{act}(Bird[x] \mid Penguin[x]) = 1$, and also
(second version) $\mathfrak{M}_{act} \models Penguin[x] \rightarrow Bird[x]$; it is highly reliable to
infer from Bird(a) to CanFly(a), since (first version)
$Prob_{act}(CanFly[x] \mid Bird[x]) > 1 - \epsilon$ for some small but fixed $\epsilon < 0.5$, but
also because (second version) $\mathfrak{M}_{act} \models Bird[x] \Rightarrow_{hp} CanFly[x]$ and
(third version) $\mathfrak{M}_{act} \models Bird[x] \Rightarrow_{nor} CanFly[x]$.
Of course, the two notions of absolute reliability are indeed cautious ones, since 
absolutely reliable inferences turn out to be exceptionless (up to probability 0 
in the quantitative case), but this is just what we intend absolute reliability to 
be like. However, neither the notions of absolute reliability nor the notions of 
high reliability are overly cautious. On the other hand, if we had defined an 
inference from $\alpha[a]$ to $\beta[a]$ to be absolutely reliable just in case
$\alpha[x] \rightarrow \beta[x]$ had been logically true, that would have been immoderate.
Furthermore, A’s nonmonotonic inference from Bird(a) to CanFly(a) would of course
turn out
to be unjustified if we defined justifiedness for this inference in terms of abso- 
lute reliability - but that is a different problem which has to do with a certain 
mismatch between absolute reliability and nonmonotonic inferences that we are 
going to deal with in section 7.9. 

Obviously, Correctness and Power are rather “insensitive” to the
differences between the various reliability notions. Therefore, we have to turn to
the other postulates in our list of quality criteria in order to see which notion(s)
of reliability are indeed appropriate for the kind of theory we are looking for. 

Let us consider Simplicity now: in order to be creatable by an agent, 
a reliable inference has to be based on a general belief, and the content of the 
latter belief has to be represented by the agent - not necessarily explicitly by 
means of a syntactical expression, but in some way. According to section 6.3, 
the general belief itself is either basic, or it is acquired by some learning process 
or by some monotonic or nonmonotonic reasoning process or by the composition 
of the latter kinds of processes, i.e., by an inductive reasoning process. Sim- 
plicity demands a definition of reliability in a way, s.t.: (i) a “simple” cognitive 
agent is capable of having a large set of general beliefs on which only reliable 
inferences are based, and thus that she is also able to represent the contents of 
the latter beliefs; (ii) a low-level agent is capable of learning such general
beliefs second-order reliably; (iii) she is (perhaps) capable of second-order reliable
reasoning from such general beliefs to further general beliefs; finally (iv) she 
is capable of drawing (then first-order reliable) inferences on the basis of such 
beliefs. We are now going to argue that if reliability is defined quantitatively, it 
is improbable that (i)-(iv) are satisfied. We are going to deal with second-order 
reliability more thoroughly in section 7.10, so we rather presuppose an intu- 
itive understanding of the notion of second-order reliability in the subsequent 
considerations. 

We say that it is “improbable” that (i)-(iv) are satisfied because it is
difficult to prove such a claim even for a specific class of cognitive architectures,
let alone in general; but there are some plausible indications. For instance, let 
us consider agents in the symbolic computation paradigm, i.e., agents which 
(a) represent entities - in our case the contents of intentional states - by means 
of symbolic expressions in an internal language, and where (b) their cognitive 
activities are the results of rule-based computations that are applied to the sym- 
bolic expressions referred to in (a). It might be questioned whether a symbolic 
computation architecture may be called ‘low-level’ at all, independently of what 
it looks like precisely, but let us now just see whether there are any significant 
differences between symbolic computation agents which implement quantita- 
tively reliable inferences and symbolic computation agents which implement 
qualitatively reliable inferences. First of all, how might such agents represent 
the contents of the general beliefs on which their reliable inferences are to be 
based? Let us focus on high quantitative reliability: one possible answer to the 
question raised is that such an agent might simply represent the contents of the 
general beliefs by conditionals in her internal language, s.t., say,
$\alpha[x] \Rightarrow \beta[x]$ is stored in a symbolic knowledge base - i.e., the symbolic
computation agent believes that $\alpha[x] \Rightarrow \beta[x]$ is true - only if
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$ for some small pre-defined $\epsilon$.^ In this way,
the precise numerical information concerning the conditional probability
$Prob_{act}(\beta[x] \mid \alpha[x])$ is lost, but this information is perhaps not necessary
for the generation of an inference on the basis of a general belief, anyway. The
second question is: how would such an agent be able to acquire conditional
expressions like $\alpha[x] \Rightarrow \beta[x]$ by means of learning or reasoning (while
neglecting “innate” beliefs for the moment)? In the first case, the agent needs
to have some primary information on the exact probability
$Prob_{act}(\beta[x] \mid \alpha[x])$ before she can check whether
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$, and before she can memorize/process the
expression $\alpha[x] \Rightarrow \beta[x]$ in case that
the latter inequality is actually satisfied. Such precise probability data are pos- 
sibly hard to acquire, but let us assume that this can be achieved by a low- 
level agent. In the second case, the agent would have to reason from a set 
of such stored conditionals to further conditionals, s.t. only reliable inferences 
are based on the latter. But without any numerical information on the
degrees of probability of the stored expressions only rather trivial rules might be
applied to the set of memorized conditionals in order to produce new
conditionals by symbolic computation on which again only reliable inferences are
to be based: e.g., from $\alpha[x] \Rightarrow \beta[x]$ the agent might reason to
$\gamma[x] \Rightarrow \beta[x]$, if both $\alpha[x] \rightarrow \gamma[x]$ and $\gamma[x] \rightarrow \alpha[x]$ are
derivable from the agent’s knowledge base of universal conditionals by the rules
of classical logic: this rule is usually called ‘Left Equivalence’; or, e.g., she
might reason from $\alpha[x] \Rightarrow \beta[x]$ to $\alpha[x] \Rightarrow \gamma[x]$, if
$\beta[x] \rightarrow \gamma[x]$ is derivable from the agent’s knowledge base of
universal conditionals by the rules of classical logic: this rule is usually called
‘Right Weakening’. In Adams[5], pp.3f, one can find a complete list of rules that
preserve minimum probabilities, s.t., if a rule is applied to a set of conditionals,
the probability of the reasoned conditional is at least as high as the minimum
of the probabilities of the given conditionals; if the latter minimum is larger
than $1 - \epsilon$, it is thus preserved for the reasoned conditional. This system of
rules is quite weak and lacks many of the rules that one would intuitively like
to have; put differently, it is overly cautious. Furthermore, the system is sound
and complete with respect to minimum probability preservation, but not with
respect to “$> 1 - \epsilon$” preservation for the fixed $\epsilon$. Due to these weaknesses,
symbolic reasoning is constrained only to rather trivial cases, and therefore a
great many of the general beliefs of a symbolic computation agent would have
to be basic or to be learnt from singular instant beliefs, if only quantitatively
reliable inferences are to be based on the general beliefs and if the set of such
beliefs should be large and practically interesting. But that seems to be
impossible even for a high-level agent like a human being, let alone for low-level
agents, since, as we hypothesize, the set of reasoned general beliefs will usually
have to drastically outnumber the set of learnt or basic beliefs, at least, if the
agent concerned is a symbolic computation agent.

^Actually, we should rather say that the translation of $\alpha[x] \Rightarrow \beta[x]$ in the agent’s internal
language is stored in the agent’s knowledge base - but let us ignore such details.
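
The behaviour of Right Weakening with respect to the threshold can be checked on a small example. The following is a minimal sketch only: the weighted finite domain, the predicates, and the value chosen for the threshold parameter are hypothetical, and the code merely computes conditional probabilities as relative frequencies; it does not implement Adams’ rule system.

```python
from fractions import Fraction
from typing import Callable, Dict, Optional

Pred = Callable[[str], bool]


def cond_prob(weights: Dict[str, int], beta: Pred, alpha: Pred) -> Optional[Fraction]:
    """Prob(beta | alpha) over a finite weighted domain; None if Prob(alpha) = 0."""
    total_alpha = sum(w for obj, w in weights.items() if alpha(obj))
    if total_alpha == 0:
        return None
    both = sum(w for obj, w in weights.items() if alpha(obj) and beta(obj))
    return Fraction(both, total_alpha)


def storable(prob: Fraction, epsilon: Fraction) -> bool:
    """The storage test from the text: the probability must exceed 1 - epsilon."""
    return prob > 1 - epsilon


# Hypothetical population counts.
weights = {"sparrow": 90, "ostrich": 6, "penguin": 4, "bat": 10}


def is_bird(obj: str) -> bool:
    return obj != "bat"


def can_fly(obj: str) -> bool:
    return obj in {"sparrow", "bat"}


def flies_or_runs(obj: str) -> bool:
    # Every flying object also satisfies this weaker predicate.
    return obj in {"sparrow", "bat", "ostrich"}


epsilon = Fraction(1, 5)
p_fly = cond_prob(weights, can_fly, is_bird)
p_weak = cond_prob(weights, flies_or_runs, is_bird)
print(p_fly, storable(p_fly, epsilon))    # 9/10 True
print(p_weak, storable(p_weak, epsilon))  # 24/25 True
```

Since every object that satisfies the stronger consequent also satisfies the weaker one, the conditional probability cannot decrease, so a conditional that passes the threshold test still passes it after Right Weakening.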

As a consequence, a symbolic computation agent has to represent the 
contents of the general beliefs on which her quantitatively highly reliable
inferences are to be based differently, if the agent should be able to draw a
large number of quantitatively reliable inferences. E.g., the agent might
represent the contents of the general beliefs not just by conditionals, but by
conditionals plus their associated probabilities, s.t., say, a complex expression
$\alpha[x] \Rightarrow \beta[x] \,|\, prob_{act}(\beta[x] \mid \alpha[x])$ is stored in a symbolic knowledge base,
where $prob_{act}(\beta[x] \mid \alpha[x])$ is a name for the real number
$Prob_{act}(\beta[x] \mid \alpha[x])$, and $Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$ for some small
pre-defined $\epsilon$. This might either be interpreted in the way that the agent has
a categorical belief the content of which is expressed by
‘$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$’, or that she has the belief that
[$\alpha[x] \Rightarrow \beta[x]$ is true] with a degree of
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$. In
this way, the agent would indeed be able to reason from a set of such condi- 
tional/number expressions to further expressions of this kind by the application 
of rules: the agent could do this in the same way as statisticians do, i.e., by 
applying rules of inference to (i) the axioms of the probability calculus, i.e., of 
probability measure spaces, (ii) the axioms of real analysis, and (iii) the axioms
of first-order predicate calculus; or she could do so by applying rules that can
be extracted from these axioms. If the agent’s internal language is sufficiently
expressive, we know that the agent is principally incapable of reasoning to
all expressions $\alpha[x] \Rightarrow \beta[x] \,|\, prob_{act}(\beta[x] \mid \alpha[x])$ for
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$ by such rules, due to the notorious
incompleteness theorems affecting axiomatizable extensions of first-order
arithmetic; but this problem is practically irrelevant. The real problem is that
such computations cannot be carried out by a low-level agent: presumably, the
recursion-theoretical complexity of large and non-trivial sets of such expressions
$\alpha[x] \Rightarrow \beta[x] \,|\, prob_{act}(\beta[x] \mid \alpha[x])$ for
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$ is high, just as the proof-theoretical complexity
of the first-order theory sketched above is high. It seems improbable that these 
complexities could be reduced in a way such that also low-level agents would be 
able to conclude many of such expressions while preserving probabilities above 
1 — e; being a low-level agent and being a statistician is mutually exclusive, 
so to speak. Pearl[118], pp.494f, expresses this in the following way: “Why not 
develop a logic that characterizes moderately high probabilities, say
probabilities higher than 0.5 or 0.9 - or more ambitiously, higher than $\alpha$, where $\alpha$ is a
parameter chosen to fit the domains of the predicates involved? The answer is 
that any such alternative logic would be extremely complicated and probably 
would need to invoke many of the axioms of arithmetic . . . No logic is known 
that can faithfully replace arithmetic in reasoning about majorities. Moreover, 
it appears that every logic capable of producing sound and complete inferences 
about majorities (let alone other thresholds of proportions) is bound to have 
the complexity of arithmetic inequalities.” We have not defined what we mean 
by ‘low-level agent’ precisely, but, in any case, the term should exclude agents
drawing inferences on the basis of explicit complex mathematical calculations. 

A third way of dealing with quantitative reliability is proposed by 
Schurz [147]: he suggests a symbolic computation architecture where complex
expressions of the form $\alpha[x] \Rightarrow \beta[x] \,|\, r_{\alpha,\beta}$ are used as the
representations of the contents of general beliefs, s.t. $r_{\alpha,\beta}$ is a name for a real
number and $1 - r_{\alpha,\beta}$ is some lower boundary for $Prob_{act}(\beta[x] \mid \alpha[x])$, i.e.,
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - r_{\alpha,\beta} > 1 - \epsilon$ for some fixed but task-specific
$\epsilon$; in this way, the agent does not need to acquire precise probabilities by
learning processes but just some precise lower boundaries given by $r_{\alpha,\beta}$ for
$Prob_{act}(\beta[x] \mid \alpha[x])$. The reasoning part of the agent is divided up into an
assumption generation mechanism, a lower boundary propagation mechanism, and the
application of the rules of inference of a logical system that is the further
development of a well-known system called ‘P’ (see also Schurz [148]) where the
rules are applied to expressions like $\alpha[x] \Rightarrow \beta[x] \,|\, r_{\alpha,\beta}$ in order to
derive new conditionals $\gamma[x] \Rightarrow \delta[x]$. The
system P is a rich and powerful logical system which is sound and complete 
with respect to the probability semantics for high probability conditionals. It is 
to be discussed in part III. Thus, the derivation mechanism uses the rules of a 
system which is adequate for vague probability and which is sufficiently simple
to be counted as low-level if symbolic computation is counted as low-level at all.
The lower bound propagation mechanism calculates for $\gamma[x] \Rightarrow \delta[x]$, which has
been derived from a set of expressions like $\alpha[x] \Rightarrow \beta[x] \,|\, r_{\alpha,\beta}$, a new
lower probability boundary $1 - r_{\gamma,\delta}$; this may be achieved by relatively simple
real number operations that might also be implemented within a low-level agent by
means of symbolic computation. The combination of all mechanisms leads to the
symbolic computation of an expression $\gamma[x] \Rightarrow \delta[x] \,|\, r_{\gamma,\delta}$ in the way
that $\gamma[x] \Rightarrow \delta[x]$ is derived from given conditionals and plausible further
assumptions, $r_{\gamma,\delta}$ is calculated by the lower bound propagation mechanism, and
it is finally checked whether $1 - r_{\gamma,\delta}$ is larger than $1 - \epsilon$; if this is the
case, $\gamma[x] \Rightarrow \delta[x] \,|\, r_{\gamma,\delta}$ is
added to the symbolic knowledge base. This architecture leads to an elegant 
and relatively simple second-order quantitatively reliable generation of
expressions $\gamma[x] \Rightarrow \delta[x] \,|\, r_{\gamma,\delta}$, s.t. a first-order quantitatively reliable
nonmonotonic inference can be based on the presence of $\gamma[x] \Rightarrow \delta[x]$. But
this does not entail that a simple symbolic computation agent would be able to
draw large, relevant, interesting, and representative amounts of quantitatively
reliable inferences: let us assume that each of the expressions
$\alpha[x] \Rightarrow \beta[x] \,|\, r_{\alpha,\beta}$ which are basic, or which are the outputs of learning
processes, or which are produced by the assumption generator, are such that
$\alpha[x] \Rightarrow \beta[x]$ is highly probable in the vague sense; in such a case, further
expressions $\gamma[x] \Rightarrow \delta[x] \,|\, r_{\gamma,\delta}$ are derived by rules which are sound and
complete with respect to high probability preservation, i.e., also
$\gamma[x] \Rightarrow \delta[x]$ is highly probable, where ‘highly probable’ is a vague term
again. Thus, the reliability concept relative to which Schurz’s symbolic
computation agents produce “large” amounts of reliable inferences is not so much
high quantitative reliability simpliciter, but rather the conjunction of high
quantitative and high qualitative reliability. Since conditionals are derived by
rules for high probability preservation, many important instance expressions of
the form $\gamma[x] \Rightarrow \delta[x] \,|\, r_{\gamma,\delta}$ will indeed not be producible at all
although the precise probability of $\gamma[x] \Rightarrow \delta[x]$ is larger than $1 - \epsilon$.
Similarly, since every derived assumption is tested for its lower boundary being
larger than $1 - \epsilon$, also many important instance expressions of the form
$\gamma[x] \Rightarrow \delta[x] \,|\, r_{\gamma,\delta}$ will not be producible by such an agent although the
probability of $\gamma[x] \Rightarrow \delta[x]$ may
be vaguely classified as high. Schurz’s proposal leads therefore to an architec- 
ture for symbolic computation agents which are relatively simple and which are 
able to produce large amounts of both quantitatively and qualitatively reliable 
inferences. Such a conjunctive reliability notion is particularly adequate for 
those compromise cases in which qualitative reliability is not cautious enough, 
but where quantitative reliability alone is useless because a simple agent ar- 
chitecture is intended. But that does not show that there would be a simple 
symbolic computation architecture which would match quantitative reliability 
adequately. 
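
To give a rough idea of what a lower bound propagation step followed by a threshold check might look like, here is a minimal sketch. It is not Schurz’s mechanism, whose details are not given in the text: it assumes, purely for illustration, a single conjunction rule for two conditionals with the same antecedent, for which the uncertainty bounds can simply be added (the probability that one of the two consequents fails is at most the sum of the individual failure probabilities). All class and function names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class BoundedConditional:
    """A conditional together with a bound r, read as: Prob(consequent | antecedent) >= 1 - r."""
    antecedent: str
    consequent: str
    r: Fraction


def and_rule(c1: BoundedConditional, c2: BoundedConditional) -> BoundedConditional:
    """Conjoin the consequents of two conditionals that share an antecedent.

    Assumed propagation step: the new bound is r1 + r2, capped at 1.
    """
    if c1.antecedent != c2.antecedent:
        raise ValueError("the rule requires identical antecedents")
    return BoundedConditional(
        antecedent=c1.antecedent,
        consequent=f"({c1.consequent} & {c2.consequent})",
        r=min(c1.r + c2.r, Fraction(1)),
    )


def accept(c: BoundedConditional, epsilon: Fraction) -> bool:
    """Final check: the propagated lower boundary 1 - r must exceed 1 - epsilon."""
    return 1 - c.r > 1 - epsilon


# Hypothetical stored expressions.
flies = BoundedConditional("Bird[x]", "CanFly[x]", Fraction(1, 20))
feathers = BoundedConditional("Bird[x]", "HasFeathers[x]", Fraction(1, 100))

derived = and_rule(flies, feathers)
print(derived.r)                          # 3/50
print(accept(derived, Fraction(1, 10)))   # True
print(accept(derived, Fraction(1, 20)))   # False
```

The last line illustrates the point made above: a derived conditional may fail the fixed threshold test even though both of its premises pass it, because the propagated bound is only a cautious estimate.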







As we have seen, there seem to be good reasons to assume that the Sim- 
plicity desideratum from above excludes a quantitative notion of reliability, at 
least as far as symbolic computation agents are concerned. This does, of course, 
not in any way contradict the typical Bayesian approaches to justification - it 
only contradicts Bayesianism if a notion of justifiedness is to be explicated that 
is positively applicable to low-level symbolic computation agents. This view is 
strengthened by the fact that also the Feasibility postulate from above calls any 
symbolic computation implementation of quantitatively reliable inference into 
question, since the proof-theoretically oriented theorem derivation algorithms 
for complex axiom systems are notoriously problematic concerning complexity 
issues; also the more semantically oriented algorithms computing qualitatively 
reliable inferences are not feasible since they usually presuppose satisfiability 
checks for propositional formulas in order to decide logical entailment; but
testing for satisfiability in propositional logic is well-known to be NP-complete.
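
The cost of such satisfiability checks can be made concrete by a naive procedure. The sketch below is an illustration under our own assumptions, not an algorithm from the literature cited here: formulas are represented as Python functions over truth assignments, and entailment is decided by enumerating all assignments, so the running time grows as 2^n in the number n of propositional variables.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Assignment = Dict[str, bool]
Formula = Callable[[Assignment], bool]


def satisfiable(formula: Formula, variables: Sequence[str]) -> bool:
    """Brute-force check: try every truth assignment to the given variables."""
    return any(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


def entails(premise: Formula, conclusion: Formula, variables: Sequence[str]) -> bool:
    """premise entails conclusion iff 'premise and not conclusion' is unsatisfiable."""
    return not satisfiable(lambda v: premise(v) and not conclusion(v), variables)


# Hypothetical propositional example.
variables = ["penguin", "bird"]


def premise(v: Assignment) -> bool:
    # (penguin -> bird) and penguin
    return (not v["penguin"] or v["bird"]) and v["penguin"]


def conclusion(v: Assignment) -> bool:
    return v["bird"]


print(entails(premise, conclusion, variables))  # True
```

Practical solvers are far more refined than this enumeration, but no algorithm is known that avoids exponential running time on all inputs.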



7.7 Consequences: Why We Opt for Qualitative Reliability 

We are left with three possible lines of reasoning that one might follow now: (i) 
obviously, there is no useful low-level notion of reliability at all; (ii) agents that 
are capable of drawing a great many quantitatively reliable inferences cannot 
generate such inferences by symbolic computation or by symbolic computation 
alone, but they have to use different cognitive means; (iii) a quantitative no- 
tion of reliability is not usefully applicable to low-level agents, but perhaps a 
qualitative notion may be applied. 

(i) contradicts our intuitions regarding the justifiedness of the cat’s 
inference in our cat&bird example. Simplicity is just a way of expressing the 
low-level postulate formulated in section 6.2. 

(ii) is of course open to appraisal. It is perfectly possible that there 
is a “simple” implementation of large sets of quantitatively reliable inferences, 
where symbolic computation is avoided or complemented. E.g., connectionism 
(which we are going to deal with in more detail in part IV) claims that neural 
networks with distributed representations are the adequate cognitive 
architectures for “statistical inferences” (see, e.g., Smolensky[162]); however, 
first of all, it is still far from clear whether this is actually true, and, 
secondly, it is not clear what is meant precisely by “statistical” - if this 
just means probabilistic reliability in the qualitative sense, this is again no 
argument in favour of the possibility of low-level agents drawing quantitatively 
reliable inferences. Of course, there are indeed network implementations for 
probabilistic inferences where precise probabilities are supplied, but those are 
not connectionist architectures: in the so-called Bayesian networks (see 
Pearl[118]), representation is 
localized in the way that nodes represent random variables and edges typically 
represent causal relationships. A Bayesian network is an economical 
representation of a single probability measure, and by that, potentially, also 
an economical representation of the contents of general beliefs. The contents of 
singular beliefs are represented by setting the values of some of the variables 
(nodes) in the network. An inference is usually regarded as the computation of 
the a posteriori probability of the value of each variable (node) when the 
values of some variables are given; this computation is normally carried out by 
a computer that executes an algorithm, i.e., the Bayesian network is 
supplemented by a symbolic computation unit. In our context, a (then reliable) 
inference would consist in the sketched probability revision plus a subsequent 
check against the lower reliability boundary $1 - \epsilon$. Despite their 
merits, Bayesian networks have the following shortcomings (for a more detailed 
discussion of these shortcomings see, e.g., Dubois&Prade [41], pp.5-8): (i) 
precise a priori probabilities are needed for every possible content of belief, 
s.t. incompleteness with respect to general beliefs cannot be handled by a 
Bayesian network agent; (ii) many contents of general beliefs lack any obvious 
causal basis - e.g., most students are young, but studentship does not cause 
youth; but, at least under their usual interpretation, Bayesian networks are 
only causal belief networks rather than belief networks in general. Even if we 
disregard (ii): by (i), Bayesian network agents need to have total general 
knowledge in order to draw quantitatively reliable inferences, but for a 
low-level agent (at least a natural one) it is realistic neither that such vast 
amounts of knowledge would be basic, nor that they could be learnt. Concerning 
the Feasibility criterion from above, it is known that inference in Bayesian 
networks is NP-hard and thus also problematic with respect to complexity 
considerations, if the actual inference is to be carried out by a symbolic 
computation unit that is associated with the network. 
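
As a rough illustration of “probability revision plus a subsequent check for 
the lower reliability boundary”, the following sketch computes a posterior in a 
tiny Bayesian network by enumerating all assignments. The network (Rain and 
Sprinkler as causes of WetGrass), every number in it, and the function names are 
hypothetical and not taken from the literature cited above; the enumeration is 
the naive exponential method, which fits the remark on NP-hardness. 

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler (all numbers are made up).
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(WetGrass = True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.01,
}

def joint(rain, sprinkler, wet):
    """Probability of one full assignment, factorised along the network."""
    p = (P_RAIN if rain else 1 - P_RAIN) * (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)

def posterior(query, evidence):
    """P(query | evidence), summing the joint over all assignments."""
    names = ["rain", "sprinkler", "wet"]
    numerator = denominator = 0.0
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(**world)
            denominator += p
            if all(world[k] == v for k, v in query.items()):
                numerator += p
    return numerator / denominator

def reliable(query, evidence, eps):
    """Probability revision followed by the check against the boundary 1 - eps."""
    return posterior(query, evidence) > 1 - eps

print(round(posterior({"wet": True}, {"rain": True}), 3))
print(reliable({"wet": True}, {"rain": True}, eps=0.1))
print(reliable({"wet": True}, {"rain": True}, eps=0.05))
```

Note that the verdict depends on the chosen eps, which is exactly the 
pre-selected boundary that a qualitative notion of reliability does without. 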

(iii) In chapter 8 we are going to define primary justifiedness for 
inferences on the basis of qualitative notions of reliability. In part III we 
will see that these notions conform to sound, complete, and expressive systems 
of logical rules with a clear-cut semantics; by the application of such rules, a 
low-level agent may draw large sets of qualitatively reliable inferences by 
means of symbolic computation, i.e., Simplicity is satisfied. In part IV we will 
show that the same can be done by low-level network agents under distributed 
representation, s.t. even Feasibility is plausibly satisfied, while it is not 
clear whether Feasibility can similarly be realized by a symbolic computation 
agent. By the transition from quantitative reliability to qualitative 
reliability, information is “lost”, just as a sentence expressing that 
$Prob_{act}(\beta[x] \mid \alpha[x]) > 1 - \epsilon$ for a specific small 
$\epsilon$ carries more information than a high probability or normic 
conditional expressing that $Prob_{act}(\beta[x] \mid \alpha[x])$ is high or 
that normal $\alpha$s are $\beta$s. But this “loss” of information leads to the 
positive consequence that also low-level agents are capable of drawing reliable 
inferences, which, intuitively, they do indeed. Low-level agents are only able 
to represent high quantitative probabilities after a process of “amplification” 
by which the quantitative probabilities are transformed into qualitative ones. 
For the same reasons, we use vague expressions like ‘most . . . are . . .’ or 
‘normal . . . are . . .’ in natural language as means of complexity reduction, 
either because a higher degree of complexity cannot be attained, since precise 
probability values are beyond reach, or because a higher degree of complexity is 
unnecessary or even counterproductive. When we deal with the semantics of high 
probability and normic conditionals in part III, we will see that the latter 
only express orders of objects, s.t. one object is of higher probabilistic order 
of magnitude than another, or one object is more normal than another, whereas 
probability statements express more complex issues like the assignments of real 
numbers to (sets of) objects. Finally, we assess the cat’s inference in the 
cat&bird example as justified due to its reliability, although we do not have 
knowledge of the precise proportion of birds that can fly among all birds; we 
only know that (by far the) most birds can fly, or that normal birds can fly. As 
it seems, also our pre-theoretical notion of justifiedness is vague and any 
formal theory of justified inference has to take care of this vagueness, at 
least if the low-level postulate is to be accepted. Goldman[69], p.107, 
expresses this in the following way (though presupposing a different concept of 
normality): “Obviously, the notion of normal worlds is quite vague. But I do not 
find that objectionable. I think our ordinary notion of justifiedness is vague 
on just this point.” 



7.8 Qualitative Reliability Fails Rationality (But That Does Not 
Matter) 

Our preference for a qualitative over a quantitative notion of reliability, as 
far as a theory of justified inference for low-level agents is concerned, also 
has its price. This can be seen from the so-called lottery paradox (Kyburg[86]): 
assume that a fair lottery takes place, s.t. exactly one ticket out of a large 
set of, say $n$, tickets wins. Let $p_i$ be a formula that expresses that the 
$i$-th ticket wins (for $1 \le i \le n$). The probability that one particular 
ticket wins is very low, i.e., for all $i$: the probability that the $i$-th 
ticket is not going to win is very high. Formally, this could be expressed by 
$\mathfrak{M} \models \top \Rightarrow_{hp} \neg p_i$, for all $i$, where 
$1 \le i \le n$. Intuitively, this implies that 
$\mathfrak{M} \models \top \Rightarrow_{hp} \neg p_1 \wedge \ldots \wedge \neg p_n$, 
and this intuition can even be justified semantically in part III, once we have 
stated the semantics for high probability conditionals: we simply iterate a 
valid rule that is called ‘And-Rule’ or ‘Conjunction Rule’, by which, if (by far 
the) most $\alpha$s are $\beta$s, and (by far the) most $\alpha$s are 
$\gamma$s, it follows that (by far the) most $\alpha$s are 
$\beta \wedge \gamma$s. On the other hand, the probability that at least one of 
the tickets is going to win is 1, and therefore also very high: thus, 
$\mathfrak{M} \models \top \Rightarrow_{hp} p_1 \vee \ldots \vee p_n$. But there 
is no high probability model $\mathfrak{M}$ satisfying both 
$\mathfrak{M} \models \top \Rightarrow_{hp} \neg p_1 \wedge \ldots \wedge \neg p_n$ 
and $\mathfrak{M} \models \top \Rightarrow_{hp} p_1 \vee \ldots \vee p_n$, since 
$p_1 \vee \ldots \vee p_n$ and $\neg p_1 \wedge \ldots \wedge \neg p_n$ are 
logically contradictory. In words: although it is quantitatively reliable to 
infer from $\top$ to each $\neg p_i$ separately, it is not quantitatively 
reliable to infer from $\top$ to all $\neg p_i$s collectively. According to 
qualitative reliability, if each of the single inferences is reliable, this also 
holds for the latter conjunctive inference, which is irrational as far as the 
lottery example is concerned. Therefore, if a low-level agent’s inferences are 
to be justified by a qualitative “measure” of reliability, as we have urged 
before, the set of justified inferences that she might draw is usually not going 
to coincide with the set of rational inferences that she might draw. The reason 
for this discrepancy is that, although if (by far the) most $\alpha$s are 
$\beta$s, and (by far the) most $\alpha$s are $\gamma$s, then intuitively also 
(by far the) most $\alpha$s are $\beta \wedge \gamma$s, the conditional 
probability $Prob_{act}(\beta[x] \wedge \gamma[x] \mid \alpha[x])$ that is 
underlying the latter conditional may be slightly smaller - though still being 
high - than each of the probabilities $Prob_{act}(\beta[x] \mid \alpha[x])$ and 
$Prob_{act}(\gamma[x] \mid \alpha[x])$. This small decrease of probability may 
add up to a probability below $1 - \epsilon$, but precisely that fact cannot be 
expressed by high probability conditionals. The notion of reliability that is 
defined by the latter conditionals is obviously more “brave”, i.e., more 
hazardous, than the one that is given by precise probabilities. By using high 
probabilities we have been able to get rid of the rather arbitrarily 
pre-selected boundary $\epsilon$, but, by the same move, we are now no longer 
able to describe the lottery paradox situation by means of high probability 
conditionals, s.t. a low-level agent would not draw inferences which are at the 
same time irrational but on the other hand justified according to a qualitative 
notion of reliability. We might try to describe the situation in the way that 
$\mathfrak{M} \models \top \Rightarrow_{hp} p_1 \vee \ldots \vee p_n$ is the 
case while $\mathfrak{M} \models \top \Rightarrow_{hp} \neg p_i$ is not (for any 
$i$ with $1 \le i \le n$), but still only inappropriately so: our low-level 
agent A might then have the general belief that 
[$\top \Rightarrow_{hp} p_1 \vee \ldots \vee p_n$], and draw a primarily 
justified inference from $\top$ to $p_1 \vee \ldots \vee p_n$ on the basis of 
the latter general belief, while she would not justifiedly infer from $\top$ to 
$\neg p_i$, which is again irrational given the setting of the lottery paradox. 
On the other hand, we are tempted to say that those human agents which are not 
stochastically pre-educated indeed behave as they “should” according to this 
notion of justifiedness based on qualitative reliability: obviously, a great 
many people gamble, and if they are asked for their reasons why they do so 
despite having this vague feeling that the chances for winning are low, they 
say: “well, but isn’t there nevertheless always someone who wins in the end?” 
But whether or not this correctly describes the way in which many people reason 
in such situations is of course a question for empirical research by 
psychological experiment; thus this remark should neither be taken too 
seriously, nor is it particularly relevant for the philosophical point which we 
wanted to make. 
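
The quantitative side of the lottery paradox can be checked with exact 
arithmetic. The sketch below is only an illustration under stated assumptions: 
the number of tickets, the threshold eps, and the function name `lottery` are 
chosen freely and do not come from the text. 

```python
from fractions import Fraction

def lottery(n):
    """Exact probabilities for a fair lottery with n tickets and exactly one winner."""
    tickets = range(1, n + 1)   # outcome w means: ticket w is the winner
    weight = Fraction(1, n)     # every outcome is equally likely

    def prob(event):
        return sum(weight for w in tickets if event(w))

    each_loses = [prob(lambda w, i=i: w != i) for i in tickets]  # P(not p_i)
    all_lose = prob(lambda w: all(w != i for i in tickets))      # P(not p_1 and ... and not p_n)
    some_wins = prob(lambda w: any(w == i for i in tickets))     # P(p_1 or ... or p_n)
    return each_loses, all_lose, some_wins

eps = Fraction(1, 100)          # illustrative boundary, not taken from the text
each_loses, all_lose, some_wins = lottery(200)

print(all(p > 1 - eps for p in each_loses))  # each single inference passes the boundary
print(all_lose > 1 - eps)                    # the conjunctive inference does not
print(some_wins > 1 - eps)                   # the disjunctive inference does
```

Each conjunct has probability 199/200, yet the conjunction has probability 0: 
the small losses of probability accumulate, which is exactly what an iterated 
And-Rule on high probability conditionals cannot register. 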

Does this contradict our efforts to define justifiedness for inferences drawn by 
low-level agents qualitatively? No - it just means that quantitative and 
qualitative reliability establish distinct concepts of justification: while the 
former is closely associated with both theoretical and practical rationality, 
the latter is associated with justification under the low-level postulate; 
while, according to the former notion of reliability, only high-level agents are 
able to draw large amounts of justified (rational) inferences, also low-level 
agents may draw justified inferences according to the latter. We even 
hypothesize that only complex symbolic computation agents are able to draw 
justified inferences according to the first kind of justification, which is 
based on quantitative reliability. On the other hand, low-level agents may 
indeed draw justified inferences according to qualitative reliability even if 
they do not generate these inferences by symbolic computation but, e.g., by a 
simple connectionist architecture. The second claim is proved in part IV. The 
first claim has to be left open, and it is not even clear how this might be 
stated more precisely, let alone proved. If it were true, this would strongly 
indicate that an agent which draws justified inferences according to an 
externalist notion of justification based on quantitative reliability is also 
doing so according to an internalist conception of justification (though not 
necessarily according to a classical internalist conception). 

Both our vague common sense understanding of reliability and the Simplicity and 
Feasibility problems that affect quantitative reliability have been major 
driving forces behind (qualitative) Nonmonotonic Reasoning in the AI camp, 
though not always stated as such explicitly.¹¹ In epistemology and philosophy of 
science it is mainly the justification of human beliefs, especially of 
scientific beliefs, that is studied, and thus qualitative reliability has been 
disregarded at least up to the Seventies in favor of a (mostly Bayesian) 
high-level theory of rationality. Only recently have qualitative and 
quantitative reliability been discussed on equal terms in both the philosophical 
and the computer science context. 



7.9 Qualitative Reliability vs. Monotonic and Nonmonotonic Inferences 

So we are finally left with one qualitative notion of absolute (conditional) 
reliability for inferences, and two qualitative notions of high (conditional) 
reliability for inferences. Up to now we have only focused on the reliability of 
inferences in general, independently of whether they are monotonic or 
nonmonotonic. So the question remains whether the classification of the 
qualitative notions of reliability into absolute and high reliability 
corresponds in some way to the classification of inferences into monotonic and 
nonmonotonic ones. 

Since absolute reliability is now defined by a universal conditional, and thus 
by a strict conditional, it is a monotonic notion of reliability: if it is 
absolutely reliable to infer from $\alpha[a]$ to $\beta[a]$, it is also 
absolutely reliable to infer from $\alpha[a] \wedge \gamma[a]$ to $\beta[a]$. On 
the other hand, both of the notions of high reliability which we have opted for 
are defined by defeasible conditionals - in the first case by a high probability 
conditional, in the second case by a normic conditional - and, consequently, 
they are nonmonotonic notions of reliability: if it is highly reliable in the 
probabilistic sense to infer from $\alpha[a]$ to $\beta[a]$, it is not 
necessarily highly reliable in the probabilistic sense to infer from 
$\alpha[a] \wedge \gamma[a]$ to $\beta[a]$; if it is highly reliable in the 
normic sense to infer from $\alpha[a]$ to $\beta[a]$, it is not necessarily 
highly reliable in the normic sense to infer from $\alpha[a] \wedge \gamma[a]$ 
to $\beta[a]$. 

¹¹ But, e.g., Bacchus[14] proposes a symbolic computation architecture for 
computer agents that are disposed to draw - in our language - quantitatively 
reliable inferences. 

Let us now consider a case in which A draws, say, a direct monotonic inference 
from $\alpha[a]$ to $\beta[a]$, and she does so (primarily) justifiedly, i.e., 
where it is reliable to infer from $\alpha[a]$ to $\beta[a]$ according to one of 
the two qualitative notions of high reliability. By our def.22 of inference 
ascription in part I, A’s belief that [$\alpha[a]$ is true] is a conclusive 
reason for A’s belief that [$\beta[a]$ is true]; by corollary 24 in part II, 
this implies that A is disposed to change to and remain in the belief that 
[$\beta[a]$ is true], given the belief that [$\alpha[a] \wedge \gamma[a]$ is 
true] for arbitrary $\gamma[a] \in \mathcal{L}$. But it might be the case that 
for some such $\gamma[a]$, it is indeed not highly reliable to infer from 
$\alpha[a] \wedge \gamma[a]$ to $\beta[a]$, since high reliability is not 
necessarily monotonic. Therefore, if we defined primary justification for 
monotonic inferences by means of high reliability, a justified monotonic 
inference of A could be based on a general belief of A which might have a 
dispositional substate by which A would be disposed to change to and to remain 
in a belief that would intuitively not be justified given A’s initial beliefs. 
In such a case, there is an intuitive discrepancy between the monotonicity of 
the inference that is ascribed and the nonmonotonicity of the reliability notion 
that is employed, and therefore we prefer to define reliability for monotonic 
inference by absolute reliability. 

Thus, for the special case of deductive, and therefore also monotonic, 
inferences we have: A draws in $(s(t - n\Delta t_c), \ldots, s(t))$ a primarily 
justified deductive inference from the belief that [$\alpha[a]$ is true] to the 
belief that [$\beta[a]$ is true] relative to the cognitive history 
$(s(t))_{t \in In}$ if and only if it is qualitatively absolutely reliable for A 
to infer from $\alpha[a]$ to $\beta[a]$, i.e., if and only if 
$\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$. 

There is a similar mismatch between primarily justified nonmonotonic inferences 
on the one hand and absolute reliability on the other: assume that we defined 
A’s nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ to be primarily 
justified if and only if it was absolutely reliable to infer from $\alpha[a]$ to 
$\beta[a]$. None of the typical nonmonotonic inferences from A’s total belief 
that [$\alpha[a]$ is true] to A’s belief that [$\beta[a]$ is true] would turn 
out to be justified then - e.g., since not all birds can fly, A would not 
justifiedly infer from $Bird(a)$ to $CanFly(a)$. Thus, such a definition of 
justifiedness would fail to satisfy the Power criterion from above. That is the 
reason why we will define justification for nonmonotonic inferences in terms of 
high reliability. 
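
The contrast between the two kinds of reliability can be illustrated on a toy 
population. Everything in this sketch is an assumption made for illustration: 
the population, the predicate names, and in particular the numerical threshold, 
which merely stands in for the qualitative ‘(by far the) most’ of a high 
probability conditional. 

```python
# Hypothetical population: 95 ordinary birds that fly and 5 penguins that do not.
POPULATION = (
    [{"bird": True, "penguin": False, "flies": True}] * 95
    + [{"bird": True, "penguin": True, "flies": False}] * 5
)

def absolutely_reliable(antecedent, consequent, domain=POPULATION):
    """Universal conditional: every object satisfying the antecedent satisfies the consequent."""
    return all(consequent(x) for x in domain if antecedent(x))

def highly_reliable(antecedent, consequent, domain=POPULATION, threshold=0.9):
    """Stand-in for a high probability conditional: relative frequency above a threshold."""
    members = [x for x in domain if antecedent(x)]
    return sum(1 for x in members if consequent(x)) / len(members) > threshold

bird = lambda x: x["bird"]
penguin = lambda x: x["penguin"]
flies = lambda x: x["flies"]
bird_and_penguin = lambda x: x["bird"] and x["penguin"]
penguin_and_grounded = lambda x: x["penguin"] and not x["flies"]

print(absolutely_reliable(bird, flies))                 # False: not all birds fly
print(highly_reliable(bird, flies))                     # True: 95 of 100 birds fly
print(highly_reliable(bird_and_penguin, flies))         # False: strengthening defeats it
print(absolutely_reliable(penguin, bird))               # True
print(absolutely_reliable(penguin_and_grounded, bird))  # True: strengthening is harmless
```

Absolute reliability survives any strengthening of the antecedent but is too 
demanding for the bird example; high reliability covers the bird example but is 
lost when the antecedent is strengthened to penguins. 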

In the special cases of high probability inferences and of normic inferences we 
will of course define high reliability by the corresponding notions of high 
reliability, i.e., by high probabilistic and by high normic reliability, 
accordingly: A draws in $(s(t - n\Delta t_c), \ldots, s(t))$ a primarily 
justified high probability inference from the belief that [$\alpha[a]$ is true] 
to the belief that [$\beta[a]$ is true] relative to the cognitive history 
$(s(t))_{t \in In}$ if and only if it is qualitatively highly reliable for A in 
the probabilistic sense to infer from $\alpha[a]$ to $\beta[a]$, i.e., if and 
only if $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$. A 
draws in $(s(t - n\Delta t_c), \ldots, s(t))$ a primarily justified normic 
inference from the belief that [$\alpha[a]$ is true] to the belief that 
[$\beta[a]$ is true] relative to the cognitive history $(s(t))_{t \in In}$ if 
and only if it is qualitatively highly reliable for A in the normic sense to 
infer from $\alpha[a]$ to $\beta[a]$, i.e., if and only if 
$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$. 



7.10 Second-Order Reliability 

A second-order process by which an inference process is acquired has been said 
to be (second-order) reliable if and only if it tends to produce more 
(first-order) reliable inference processes than unreliable ones. Since we 
restrict ourselves to second-order processes which are (i) either learning 
processes, or (ii) second-order monotonic reasoning processes which are at the 
same time first-order monotonic or first-order nonmonotonic (for the terminology 
recall section 6.3.2), or (iii) inductive reasoning processes, and since each of 
the latter processes leads from beliefs to the general beliefs that inferences 
are based on, we suggest also using a conditional notion of secondary 
reliability for second-order processes. We do so for the same reasons for which 
we have suggested using a conditional notion of first-order reliability for 
inferences. 

Moreover, by our supposition in section 6.3, the second-order processes 
of A either produce universal beliefs, or high probability beliefs, or normic 
beliefs. The contents of each of the latter kinds of beliefs refer to all objects 
in our epistemological sandbox territory, and, as we have assumed in chapter 
2, the general sentences that express the contents of such beliefs are time- and 
place-invariant. Thus, the (first-order) conditional reliability of an inference 
that is based on such a particular general belief is also invariant concerning 
place and time. As far as a particular second-order process from a particular 
set of beliefs to a further belief is concerned, we can therefore replace the 
phrase ‘tends to produce more (first-order) reliable inference processes than 
unreliable ones’ by the simpler ‘produces a (first-order) reliable inference 
process’. 
So we actually have: a learning process leading from a sequence of singular 
beliefs to a general belief is (second-order) conditionally reliable if and only if 
the inference that is based on the general conclusion belief is [in all possible 
cases/in most cases/in normal cases] (first-order) conditionally reliable given 
that each of the singular premise beliefs is true; a monotonic or nonmonotonic 
reasoning process leading from a set of general beliefs to a further general 
belief is (second-order) conditionally reliable if and only if the inference that is 
based on the general conclusion belief is [in all possible cases/in most cases/in 
normal cases] (first-order) conditionally reliable given that each of the general 
premise beliefs is true. Finally: an inductive reasoning process that leads from 
a sequence of singular beliefs to a general belief via learning processes and 
subsequent monotonic or nonmonotonic reasoning processes is (second-order) 
conditionally reliable if and only if the inference that is based on the general 
conclusion belief is [in all possible cases/in most cases/in normal cases] (first- 
order) conditionally reliable given that each of the singular premise beliefs is 
true. In this way, the notion of second-order conditional reliability is 
derivative of the notion(s) of first-order conditional reliability that we have 
dealt with in the last section, where we suggested using qualitative notions of 
first-order conditional reliability. If we choose a notion of second-order 
reliability, 
s.t. first-order reliability of the resulting inference is guaranteed in all possible 
cases, then we may speak of absolute second-order reliability; if we only demand 
first-order reliability in most cases, or in normal cases, we may speak of high 
second-order reliability. 

Let us now concentrate just on reasoning processes - neglecting learning 
processes and thus the induction problem -, and, as pointed out already in 
section 6.3, also just on those reasoning processes that are second-order 
monotonic, i.e., that lead from a state (not a total state) in which the agent 
has some general premise beliefs (and perhaps further general beliefs) to a 
state in which the agent has a general conclusion belief, where the reasoning 
processes are additionally either first-order monotonic or first-order 
nonmonotonic, and either all of the general beliefs involved are universal 
beliefs, or all of them are high probability beliefs, or all of them are normic 
beliefs. For the same reason as in the last section, second-order monotonicity 
is most adequately matched by absolute second-order reliability; by using a 
qualitative notion of absolute second-order reliability we do not have to deal 
with higher-order probabilities of sets of models, or a higher-order normality 
order for models. By the results of the last section, we may replace (i) talk of 
the first-order conditional reliability of the deductive inference from 
$\alpha[a]$ to $\beta[a]$ by talk of the truth of the universal conditional 
$\alpha[x] \rightarrow \beta[x]$, (ii) talk of the first-order conditional 
reliability of the high probability inference from $\alpha[a]$ to $\beta[a]$ by 
talk of the truth of the high probability conditional 
$\alpha[x] \Rightarrow_{hp} \beta[x]$, and (iii) talk of the first-order 
conditional reliability of the normic inference from $\alpha[a]$ to $\beta[a]$ 
by talk of the truth of the normic conditional 
$\alpha[x] \Rightarrow_{nor} \beta[x]$. So we can finally ascribe second-order 
reliability to reasoning processes in the following way: a second-order 
monotonic reasoning process that leads from a set of universal/high 
probability/normic premise beliefs to a universal/high probability/normic 
conclusion belief is (second-order) reliable if and only if in every possible 
case in which each of the premise beliefs is true, also the conclusion belief is 
true. When we use the phrase ‘in every possible case’ we refer to all models 
$\mathfrak{M}$ which are of the same type as our “actual” model 
$\mathfrak{M}_{act}$. In the case of universal premise beliefs, this is just a 
restatement of the usual definition of logical entailment for a special class of 
formulas in the language of first-order predicate logic. We will see in parts 
III and IV that the thus defined notion of absolute second-order reliability 
conforms to the Simplicity postulate, since it has simple, qualitative, sound 
and complete proof systems. Correctness and Power are also satisfied. However, 
Feasibility is questionable if a symbolic computation agent is used which 
reasons by constructing derivations in the corresponding proof systems. 
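
For universal beliefs, absolute second-order reliability amounts to checking 
that the conclusion holds in every model in which the premises hold. The 
following sketch does this by brute force over all interpretations of three 
one-place predicates on a small finite domain. The predicate names, the domain 
size, and the function names are assumptions made for illustration, and a finite 
enumeration is of course only a stand-in for quantifying over all models of the 
relevant type. 

```python
from itertools import product

PREDICATES = ["A", "B", "C"]  # hypothetical one-place predicates

def models(domain_size):
    """Every interpretation of the predicates over a domain of the given size."""
    rows = list(product([False, True], repeat=len(PREDICATES)))
    for combo in product(rows, repeat=domain_size):
        yield [dict(zip(PREDICATES, row)) for row in combo]

def universal(antecedent, consequent):
    """Truth condition of the universal conditional antecedent[x] -> consequent[x]."""
    return lambda model: all(obj[consequent] for obj in model if obj[antecedent])

def second_order_reliable(premises, conclusion, domain_size=2):
    """True iff the conclusion holds in every enumerated model in which all premises hold."""
    return all(
        conclusion(m) for m in models(domain_size) if all(p(m) for p in premises)
    )

# Chaining two universal beliefs is reliable; converting one is not.
print(second_order_reliable([universal("A", "B"), universal("B", "C")], universal("A", "C")))
print(second_order_reliable([universal("A", "B")], universal("B", "A")))
```

The number of interpretations grows exponentially with the number of predicates 
and domain elements, which gives a feeling for why Feasibility is questionable 
for a symbolic computation agent. 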




Chapter 8 

A THEORY OF JUSTIFIED INFERENCE 



In the following we will complement our intuitive account of justified inference 
in chapter 6, and our account of reliability in chapter 7, by a more formal 
theory. For the philosophical motivation of the theory see the corresponding 
sections in the chapters 5, 6, and 7. 

8.1 Basic Inferences and Acquired Inferences 

Let $(s(t))_{t \in In}$ be a trajectory of parameter-settings of A - a possible 
“cognitive history” of A - where $In$ is a set of successive points of time. Let 
furthermore $\rightarrow$ be some strict conditional and $\Rightarrow$ be some 
defeasible conditional. $\Delta t_c$ is again the fixed period of time that it 
takes a belief to be caused and at the same time the amount of time for which a 
belief is sustained. 

Definition 46 (Ascription of Basic/Acquired Beliefs/Direct Inferences I) 

Let $(s(t - \Delta t_c), \ldots, s(t))$ be a subsequence of $(s(t))_{t \in In}$: 

1. A has at $t$ rel. to $(s(t))_{t \in In}$ the basic general belief that 
[$\alpha[x] \rightarrow \beta[x]$ is true] iff 

(a) $s(t) \models B(\alpha[x] \rightarrow \beta[x])$ 

(b) for all $t' < t$: $s(t') \models B(\alpha[x] \rightarrow \beta[x])$ 

2. A has at $t$ rel. to $(s(t))_{t \in In}$ the acquired general belief that 
[$\alpha[x] \rightarrow \beta[x]$ is true] iff 

(a) $s(t) \models B(\alpha[x] \rightarrow \beta[x])$ 

(b) there is a $t' < t$, s.t. $s(t') \not\models B(\alpha[x] \rightarrow \beta[x])$ 

3. A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the basic 
direct monotonic inference from $\alpha[a]$ to $\beta[a]$ iff 

(a) $(s(t - \Delta t_c), \ldots, s(t)) \models \alpha[a] \rightarrow_{inf^d} \beta[a]$ 

(b) there is a strict conditional $\alpha[x] \rightarrow \beta[x]$, s.t. 
A has at $t - \Delta t_c$ rel. to $(s(t))_{t \in In}$ the basic general belief 
that [$\alpha[x] \rightarrow \beta[x]$ is true] 

4. A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the acquired 
direct monotonic inference from $\alpha[a]$ to $\beta[a]$ iff 

(a) $(s(t - \Delta t_c), \ldots, s(t)) \models \alpha[a] \rightarrow_{inf^d} \beta[a]$ 

(b) for all strict conditionals $\alpha[x] \rightarrow \beta[x]$: 
if A believes at $t - \Delta t_c$ that [$\alpha[x] \rightarrow \beta[x]$ is 
true], then A has at $t - \Delta t_c$ rel. to $(s(t))_{t \in In}$ the acquired 
general belief that [$\alpha[x] \rightarrow \beta[x]$ is true] 

5. A has at $t$ rel. to $(s(t))_{t \in In}$ the basic general belief that 
[$\alpha[x] \Rightarrow \beta[x]$ is true] iff 

(a) $s(t) \models B(\alpha[x] \Rightarrow \beta[x])$ 

(b) for all $t' < t$: $s(t') \models B(\alpha[x] \Rightarrow \beta[x])$ 

6. A has at $t$ rel. to $(s(t))_{t \in In}$ the acquired general belief that 
[$\alpha[x] \Rightarrow \beta[x]$ is true] iff 

(a) $s(t) \models B(\alpha[x] \Rightarrow \beta[x])$ 

(b) there is a $t' < t$, s.t. $s(t') \not\models B(\alpha[x] \Rightarrow \beta[x])$ 

7. A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the basic 
direct nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ iff 

(a) $(s(t - \Delta t_c), \ldots, s(t)) \models \alpha[a] \Rightarrow_{inf^d} \beta[a]$ 

(b) there is a defeasible conditional $\alpha[x] \Rightarrow \beta[x]$, s.t. 
A has at $t - \Delta t_c$ rel. to $(s(t))_{t \in In}$ the basic general belief 
that [$\alpha[x] \Rightarrow \beta[x]$ is true] 

8. A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the acquired 
direct nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ iff 

(a) $(s(t - \Delta t_c), \ldots, s(t)) \models \alpha[a] \Rightarrow_{inf^d} \beta[a]$ 

(b) for all defeasible conditionals $\alpha[x] \Rightarrow \beta[x]$: 
if A believes at $t - \Delta t_c$ that [$\alpha[x] \Rightarrow \beta[x]$ is 
true], then A has at $t - \Delta t_c$ rel. to $(s(t))_{t \in In}$ the acquired 
general belief that [$\alpha[x] \Rightarrow \beta[x]$ is true]. 

A direct inference is thus defined to be basic if it is based on at least 
one basic general belief - recall that a direct inference may be based on several 
general beliefs the contents of which are expressed by conditionals with the 
same antecedents and consequents but where the conditionals might differ with 
respect to the implication connective employed. 
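
The distinction drawn in def.46 can be mimicked on a toy history. In the sketch 
below a state is modelled simply as the set of conditionals the agent believes, 
time points are list indices, and conditionals are plain strings; all of these 
encodings, and every function name, are assumptions made for illustration. 
Clause (a) of the inference items (that the inference is actually drawn) is left 
out, so only clause (b) is modelled. 

```python
def has_belief(state, content):
    """Stand-in for s(t) |= B(content): a state is the set of believed contents."""
    return content in state

def is_basic(history, t, content):
    """Believed at t and at every earlier point of the history."""
    return has_belief(history[t], content) and all(
        has_belief(history[u], content) for u in range(t)
    )

def is_acquired(history, t, content):
    """Believed at t, but not believed at some earlier point of the history."""
    return has_belief(history[t], content) and any(
        not has_belief(history[u], content) for u in range(t)
    )

def inference_is_basic(history, t, supporting):
    """Clause (b) for basic direct inferences: some supporting conditional is basic at t."""
    return any(is_basic(history, t, c) for c in supporting)

def inference_is_acquired(history, t, supporting):
    """Clause (b) for acquired direct inferences: every believed supporting conditional is acquired."""
    return all(
        is_acquired(history, t, c) for c in supporting if has_belief(history[t], c)
    )

# Hypothetical history with three time points.
HISTORY = [
    {"bird => flies"},
    {"bird => flies"},
    {"bird => flies", "penguin -> bird"},
]

print(is_basic(HISTORY, 2, "bird => flies"))       # held from the start
print(is_acquired(HISTORY, 2, "penguin -> bird"))  # first held at the last time point
print(inference_is_basic(HISTORY, 2, ["bird => flies"]))
print(inference_is_acquired(HISTORY, 2, ["penguin -> bird"]))
```

Because a basic belief must be present at all earlier time points, a belief 
cannot be both basic and acquired relative to the same history. 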

Now we could easily define what the (i.e. A’s) basic/acquired direct 
monotonic/nonmonotonic inference from $\alpha$ to $\beta$ is, and what a 
basic/acquired direct monotonic/nonmonotonic inference is; however, since these 
definitions would be completely parallel to our def.27 of direct 
monotonic/nonmonotonic inferences except for a relativization to the history of 
the agent, we omit them. 

Analogously to def.46 we can ascribe basic/acquired universal, high probability, 
normic beliefs, and thus ascribe basic/acquired direct deductive, high 
probability, normic inferences: 

Definition 47 (Ascription of Basic/ Acquired Beliefs /Direct Inferences II) 

Let {s{t — Ate), • • • 5 s{t)) be a subsequence of {s{t))^^j^: 

1. A has at t rel. to {s(t))^^j^ the basic universal belief that [a[x] — > p[x] is 
true] ijf 

(a) s{t) N B{a[x] (3[x]) 

(b) for all t' < t: s{t) N B{a[x] P[x\) 

2. A has at t rel to {s{t))^^^^ the acquired universal belief that [a[x] ]3[x] 

is true] ijf 

(a) s{t) N B{a[x] j3[x]) 

(b) there is at' < t, s.t. s{t) B{a[x] fl[x]) 

3. A draws from t — Ate io t rel. to {s{t))^^j^ the basic direct deductive 
inference from a[a] to f3[a] iff 

(a) {s(t - Ate), • • • , s{t)) N a[a] (3[a\ 

(b) A has at t — Ate to the basic universal belief that 

]a[x] — > l3[x] is true] 

4 . A draws from t — Ate to t rel. to {s{t))^^j^ the acquired direct deductive 
inference from a[a] to P[a] iff 

(a) {s{t - Ate), ■■■, s{t)) 1= a[a] ^infd (i[a] 

(b) A has at t — Ate '^'d. to {s{t))^^j^ the acquired universal belief that 

]a[x] l3[x] is true] 
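5. A has at $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic high probability belief that
[$\alpha[x] \Rightarrow_{hp} \beta[x]$ is true] iff

(a) $s(t) \models B(\alpha[x] \Rightarrow_{hp} \beta[x])$

(b) for all $t' < t$: $s(t') \models B(\alpha[x] \Rightarrow_{hp} \beta[x])$

6. A has at $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired high probability belief that
[$\alpha[x] \Rightarrow_{hp} \beta[x]$ is true] iff

(a) $s(t) \models B(\alpha[x] \Rightarrow_{hp} \beta[x])$

(b) there is a $t' < t$, s.t. $s(t') \not\models B(\alpha[x] \Rightarrow_{hp} \beta[x])$

7. A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct high probability
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{hpinf^d} \beta[a]$

(b) A has at $t - \Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic high probability belief
that [$\alpha[x] \Rightarrow_{hp} \beta[x]$ is true]

8. A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired direct high probability
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{hpinf^d} \beta[a]$

(b) A has at $t - \Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired high probability belief
that [$\alpha[x] \Rightarrow_{hp} \beta[x]$ is true]

9. A has at $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic normic belief that
[$\alpha[x] \Rightarrow_{nor} \beta[x]$ is true] iff

(a) $s(t) \models B(\alpha[x] \Rightarrow_{nor} \beta[x])$

(b) for all $t' < t$: $s(t') \models B(\alpha[x] \Rightarrow_{nor} \beta[x])$

10. A has at $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired normic belief that
[$\alpha[x] \Rightarrow_{nor} \beta[x]$ is true] iff

(a) $s(t) \models B(\alpha[x] \Rightarrow_{nor} \beta[x])$

(b) there is a $t' < t$, s.t. $s(t') \not\models B(\alpha[x] \Rightarrow_{nor} \beta[x])$

11. A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct normic
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{ninf^d} \beta[a]$

(b) A has at $t - \Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic normic belief
that [$\alpha[x] \Rightarrow_{nor} \beta[x]$ is true]

12. A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired direct normic
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{ninf^d} \beta[a]$

(b) A has at $t - \Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired normic belief
that [$\alpha[x] \Rightarrow_{nor} \beta[x]$ is true].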
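The distinction between basic and acquired beliefs can be illustrated by a small sketch. It rests on simplifying assumptions that are not part of the theory: a cognitive history is encoded as a list of sets of believed conditionals (written as plain strings), and clause (b) is read as quantifying over the states earlier than t. The function names are hypothetical.

```python
from typing import List, Set

# Hypothetical encoding: each state is the set of general beliefs
# (written as strings) that the agent holds in that parameter-setting.
History = List[Set[str]]


def has_basic_belief(history: History, t: int, belief: str) -> bool:
    """The belief is held at t and at every earlier point of the history."""
    return belief in history[t] and all(belief in history[u] for u in range(t))


def has_acquired_belief(history: History, t: int, belief: str) -> bool:
    """The belief is held at t, but is missing at some earlier point."""
    return belief in history[t] and any(belief not in history[u] for u in range(t))


history = [
    {"bird => flies"},
    {"bird => flies"},
    {"bird => flies", "penguin => not flies"},
]

print(has_basic_belief(history, 2, "bird => flies"))            # held throughout
print(has_acquired_belief(history, 2, "penguin => not flies"))  # appears only at index 2
```

On this reading, a belief held at t is either basic or acquired, never both.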



So we can ascribe basic/acquired inferences, which are not necessarily
direct (the following definitions are structured analogously to their direct
versions):

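Definition 48 (Ascription of Basic/Acquired Inferences I)

Let $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle$ be a subsequence of $\langle s(t) \rangle_{t \in I_n}$:

1. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic monotonic inference
from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$

(b) each direct component inference is basic, i.e.

for all $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

i. $\langle s(t - n\Delta t_c), \ldots, s(t - (n-1)\Delta t_c) \rangle \models \gamma_0[a] \rightarrow_{inf^d} \gamma_1[a]$

ii. $\langle s(t - (n-1)\Delta t_c), \ldots, s(t - (n-2)\Delta t_c) \rangle \models \gamma_1[a] \rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

iii. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \gamma_{n-1}[a] \rightarrow_{inf^d} \gamma_n[a]$

it follows that:

A draws from $t - n\Delta t_c$ to $t - (n-1)\Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic
direct monotonic inference from $\gamma_0[a]$ to $\gamma_1[a]$,

A draws from $t - (n-1)\Delta t_c$ to $t - (n-2)\Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the
basic direct monotonic inference from $\gamma_1[a]$ to $\gamma_2[a]$,

$\vdots$

A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct monotonic
inference from $\gamma_{n-1}[a]$ to $\gamma_n[a]$

2. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired monotonic
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$

(b) at least one direct component inference is acquired, i.e.

there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

i. $\langle s(t - n\Delta t_c), \ldots, s(t - (n-1)\Delta t_c) \rangle \models \gamma_0[a] \rightarrow_{inf^d} \gamma_1[a]$

ii. $\langle s(t - (n-1)\Delta t_c), \ldots, s(t - (n-2)\Delta t_c) \rangle \models \gamma_1[a] \rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

iii. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \gamma_{n-1}[a] \rightarrow_{inf^d} \gamma_n[a]$

and:

A draws from $t - n\Delta t_c$ to $t - (n-1)\Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired
direct monotonic inference from $\gamma_0[a]$ to $\gamma_1[a]$, or

A draws from $t - (n-1)\Delta t_c$ to $t - (n-2)\Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the
acquired direct monotonic inference from $\gamma_1[a]$ to $\gamma_2[a]$, or

$\vdots$

A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired direct monotonic
inference from $\gamma_{n-1}[a]$ to $\gamma_n[a]$

3. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic nonmonotonic
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{inf} \beta[a]$

(b) each direct component inference is basic*

4. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired nonmonotonic
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{inf} \beta[a]$

(b) at least one direct component inference is acquired.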

Finally, we have the corresponding ascription of basic/acquired deductive,
high probability, and normic inferences:

Definition 49 (Ascription of Basic/Acquired Inferences II)

Let $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle$ be a subsequence of $\langle s(t) \rangle_{t \in I_n}$:

1. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic deductive inference
from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{dinf} \beta[a]$

(b) each direct component inference is basic

2. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired deductive
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{dinf} \beta[a]$

(b) at least one direct component inference is acquired

* Phrases like this one which refer to direct component inferences are to be understood
from now on in analogy to items 1 and 2 of definition 48.

3. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic high probability
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{hpinf} \beta[a]$

(b) each direct component inference is basic

4. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired high probability
inference from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{hpinf} \beta[a]$

(b) at least one direct component inference is acquired

5. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic normic inference
from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{ninf} \beta[a]$

(b) each direct component inference is basic

6. A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired normic inference
from $\alpha[a]$ to $\beta[a]$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{ninf} \beta[a]$

(b) at least one direct component inference is acquired.

8.2 Reliability for Inferences 

Now we can restate our definitions of first-order reliability for inferences. Following
chapter 2, let $\mathfrak{M}_{act}$ be the model that is associated with our epistemological
sandbox territory. $\mathfrak{M}_{act}$ has (i) a “classical” part comprising the interpretation
mapping $\mathfrak{I}_{act}$ from chapter 2, (ii) a probabilistic part consisting of a probability
measure $Prob_{act}$ satisfying high probability conditionals, and (iii) a normic
component including a normality order $\prec_{act}$ that is defined in some way on
$D_{act}$, s.t. the normic part satisfies normic conditionals. Models like $\mathfrak{M}_{act}$ will
be defined precisely in part III.

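Let $\alpha[a], \beta[a] \in \mathcal{L}$:

Definition 50 (First-Order Reliability)

1. It is absolutely reliable to draw an inference from $\alpha[a]$ to $\beta[a]$ iff
for all $d \in D_{act}$: if $\langle \mathfrak{I}_{act}, d \rangle \models \alpha[a]$ then $\langle \mathfrak{I}_{act}, d \rangle \models \beta[a]$, i.e., iff
$\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$

2. it is highly reliable in the probabilistic sense to draw an inference from
$\alpha[a]$ to $\beta[a]$ iff

$Prob_{act}(\{d \in D_{act} : \langle \mathfrak{I}_{act}, d \rangle \models \beta[a]\} \mid \{d \in D_{act} : \langle \mathfrak{I}_{act}, d \rangle \models \alpha[a]\})$ is
high, i.e., iff

$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$

3. it is highly reliable in the normic sense to draw an inference from $\alpha[a]$ to
$\beta[a]$ iff

for all $d \in D_{act}$: if $d$ is maximally normal among the members of the set
$\{d' \in D_{act} \mid \langle \mathfrak{I}_{act}, d' \rangle \models \alpha[a]\}$ measured according to the normality order
$\prec_{act}$ defined on $D_{act}$, then $\langle \mathfrak{I}_{act}, d \rangle \models \beta[a]$, i.e., iff

$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$.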
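The three reliability notions can be made concrete on a finite toy domain. The following is only a sketch under stated assumptions, none of which is fixed by the definition: objects are named and described by the set of predicates they satisfy, the probability measure is uniform, "high" is read as a threshold of 9/10, an antecedent of probability zero counts as trivially satisfied, and the normality order is encoded as an integer rank (lower means more normal).

```python
from fractions import Fraction
from typing import Callable, Dict, Set

Props = Set[str]                      # the predicates an object satisfies
Formula = Callable[[Props], bool]     # alpha[a], beta[a] evaluated on one object


def absolutely_reliable(domain: Dict[str, Props], alpha: Formula, beta: Formula) -> bool:
    """Every alpha-object is a beta-object."""
    return all(beta(p) for p in domain.values() if alpha(p))


def highly_reliable_prob(domain: Dict[str, Props], prob: Dict[str, Fraction],
                         alpha: Formula, beta: Formula,
                         threshold: Fraction = Fraction(9, 10)) -> bool:
    """Conditional probability of beta given alpha is at least `threshold`."""
    p_alpha = sum((prob[d] for d, p in domain.items() if alpha(p)), Fraction(0))
    if p_alpha == 0:
        return True  # assumed convention for an antecedent of probability zero
    p_both = sum((prob[d] for d, p in domain.items() if alpha(p) and beta(p)), Fraction(0))
    return p_both / p_alpha >= threshold


def highly_reliable_normic(domain: Dict[str, Props], rank: Dict[str, int],
                           alpha: Formula, beta: Formula) -> bool:
    """All maximally normal (lowest-rank) alpha-objects are beta-objects."""
    alpha_objs = [d for d, p in domain.items() if alpha(p)]
    if not alpha_objs:
        return True
    best = min(rank[d] for d in alpha_objs)
    return all(beta(domain[d]) for d in alpha_objs if rank[d] == best)


# Toy sandbox (hypothetical): 19 flying birds and one penguin.
domain = {f"b{i}": {"bird", "flies"} for i in range(19)}
domain["opus"] = {"bird", "penguin"}
prob = {d: Fraction(1, len(domain)) for d in domain}            # uniform measure
rank = {d: (1 if "penguin" in p else 0) for d, p in domain.items()}

bird = lambda p: "bird" in p
flies = lambda p: "flies" in p

print(absolutely_reliable(domain, bird, flies))           # the penguin is a counterexample
print(highly_reliable_prob(domain, prob, bird, flies))    # 19/20 of the birds fly
print(highly_reliable_normic(domain, rank, bird, flies))  # the most normal birds all fly
```

An integer rank induces only a special kind of normality order; the sketch uses it because it makes "maximally normal" easy to compute.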

The conditional probability in 2 is defined in def.61 in section 9.2.
Since, according to chapter 2, $\rightarrow$ is monotonic while $\Rightarrow_{hp}$ and $\Rightarrow_{nor}$
are nonmonotonic, and thus universal conditionals like $\alpha[x] \rightarrow \beta[x]$ are strict,
while high probability conditionals like $\alpha[x] \Rightarrow_{hp} \beta[x]$ and normic conditionals
like $\alpha[x] \Rightarrow_{nor} \beta[x]$ are defeasible, it follows†:

Corollary 51 (Monotonicity and Nonmonotonicity of First-Order Reliability
Notions)

1. Absolute reliability is monotonic, i.e., for all $\alpha[a], \beta[a], \gamma[a] \in \mathcal{L}$:

if it is absolutely reliable to draw an inference from $\alpha[a]$ to $\beta[a]$, then it
is absolutely reliable to draw an inference from $\alpha[a] \wedge \gamma[a]$ to $\beta[a]$

2. high reliability (in either sense) is nonmonotonic, i.e., there are $\alpha[a]$,
$\beta[a]$, $\gamma[a] \in \mathcal{L}$, s.t.:

it is highly reliable to draw an inference from $\alpha[a]$ to $\beta[a]$, but it is not
highly reliable to draw an inference from $\alpha[a] \wedge \gamma[a]$ to $\beta[a]$.
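A minimal numerical illustration of the corollary, again under assumptions of our own choosing (a uniform measure on a hypothetical twenty-object domain and a threshold of 9/10 for "high"): strengthening the premise from "bird" to "bird and penguin" destroys high reliability, whereas an absolutely reliable inference stays absolutely reliable under strengthening.

```python
from fractions import Fraction

# 19 flying birds and one non-flying penguin (hypothetical toy domain).
objects = [{"bird", "flies"}] * 19 + [{"bird", "penguin"}]
HIGH = Fraction(9, 10)   # assumed threshold for "high"


def share(alpha, beta) -> Fraction:
    """Relative frequency of beta among the alpha-objects (assumed non-empty)."""
    alphas = [o for o in objects if alpha(o)]
    return Fraction(sum(1 for o in alphas if beta(o)), len(alphas))


bird = lambda o: "bird" in o
flies = lambda o: "flies" in o
penguin = lambda o: "penguin" in o
bird_and_penguin = lambda o: bird(o) and penguin(o)

# High reliability is lost under premise strengthening ...
print(share(bird, flies) >= HIGH)               # 19/20 clears the threshold
print(share(bird_and_penguin, flies) >= HIGH)   # 0 does not

# ... whereas absolute reliability (share 1) survives it.
print(share(penguin, bird) == 1)
print(share(lambda o: penguin(o) and not flies(o), bird) == 1)
```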

8.3 Reliability for Reasoning Processes 

Now we can define reliability for monotonic or nonmonotonic reasoning processes.
Here we do not refer to the specific model $\mathfrak{M}_{act}$ that corresponds to our
sandbox territory, but to all models of the same “type” as $\mathfrak{M}_{act}$; in part III
we will define such models, and thus the class of all such models, exactly. In the
case of nonmonotonic reasoning we are again going to include the case where
the premise set from which the reasoning proceeds includes strict conditionals;
but, in any case, at least one of the premises should be defeasible.

† More precisely, the “high reliability” claim of corollary 51 only follows given that
$\mathfrak{M}_{act}$ is realistic, and thus not “trivialized”.
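Let $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x], \alpha[x] \rightarrow \beta[x]$ be conditionals for some
monotonic implication sign $\rightarrow$ that we might consider, and let
$\alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow/\Rightarrow \beta_n[x], \alpha[x] \Rightarrow \beta[x]$ be conditionals for some
nonmonotonic implication sign $\Rightarrow$ that we might consider: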

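Definition 52 (Absolute Second-Order Reliability)

1. It is absolutely reliable to reason from $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]$
to $\alpha[x] \rightarrow \beta[x]$ iff

for all models $\mathfrak{M}$: if $\mathfrak{M} \models \alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]$ then
$\mathfrak{M} \models \alpha[x] \rightarrow \beta[x]$

2. it is absolutely reliable to reason from $\alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow/\Rightarrow \beta_n[x]$
to $\alpha[x] \Rightarrow \beta[x]$ iff

for all models $\mathfrak{M}$: if $\mathfrak{M} \models \alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow/\Rightarrow \beta_n[x]$ then
$\mathfrak{M} \models \alpha[x] \Rightarrow \beta[x]$.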

The ‘for all’ clause immediately implies: 

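Corollary 53 (Monotonicity of Absolute Second-Order Reliability)

Absolute second-order reliability is monotonic, i.e.,

for all conditionals $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x], \alpha[x] \rightarrow \beta[x], \gamma[x] \rightarrow \delta[x]$
of the monotonic kind, and

for all conditionals $\alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow/\Rightarrow \beta_n[x], \alpha[x] \Rightarrow \beta[x], \gamma[x] \rightarrow/\Rightarrow \delta[x]$
of the second kind:

1. if it is absolutely reliable to reason from $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]$
to $\alpha[x] \rightarrow \beta[x]$, then

it is also absolutely reliable to reason from $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x], \gamma[x] \rightarrow \delta[x]$
to $\alpha[x] \rightarrow \beta[x]$

2. if it is absolutely reliable to reason from $\alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow/\Rightarrow \beta_n[x]$
to $\alpha[x] \Rightarrow \beta[x]$, then

it is also absolutely reliable to reason from $\alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow/\Rightarrow \beta_n[x], \gamma[x] \rightarrow/\Rightarrow \delta[x]$
to $\alpha[x] \Rightarrow \beta[x]$.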
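For the monotonic case, the quantification over all models can be checked by brute force once the language is small. The sketch below makes assumptions that are ours rather than the text's: there are three propositional atoms, a model is any non-empty set of truth-value assignments ("worlds"), and a universal conditional holds in a model iff every world satisfying the antecedent satisfies the consequent. Nonmonotonic conditionals are not covered.

```python
from itertools import combinations, product

ATOMS = ("p", "q", "r")
WORLDS = [dict(zip(ATOMS, bits)) for bits in product((False, True), repeat=len(ATOMS))]


def models():
    """Every non-empty set of worlds counts as one model (illustrative assumption)."""
    for size in range(1, len(WORLDS) + 1):
        yield from combinations(WORLDS, size)


def satisfies(model, conditional) -> bool:
    """A model satisfies alpha -> beta iff all its alpha-worlds are beta-worlds."""
    alpha, beta = conditional
    return all(beta(w) for w in model if alpha(w))


def absolutely_reliable_to_reason(premises, conclusion) -> bool:
    """The conclusion holds in every model in which all premises hold."""
    return all(satisfies(m, conclusion) for m in models()
               if all(satisfies(m, c) for c in premises))


p = lambda w: w["p"]
q = lambda w: w["q"]
r = lambda w: w["r"]

chain = [(p, q), (q, r)]
print(absolutely_reliable_to_reason(chain, (p, r)))             # transitivity holds
print(absolutely_reliable_to_reason([(p, q)], (q, p)))          # conversion fails
print(absolutely_reliable_to_reason(chain + [(r, p)], (p, r)))  # extra premise does no harm
```

The last call illustrates the corollary: adding a premise can only shrink the class of models that have to be checked, so a positive verdict is preserved.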

8.4 The Explication of Justified Inference 

As emphasized in chapter 6, we regard the justification of inferences to be
dependent on whether the inference concerned is basic or acquired. Moreover,
we do not really define justifiedness for inference processes now, i.e., sets of
trajectories of parameter-settings of A, but we rather define justified-inference
ascription by means of a justified-inference operator:

Let $J$ be a fixed unary operator; let $\mathcal{L}_J$ be the set of formulas of the
form $J(\alpha[a] \rightarrow_{inf} \beta[a])$ or $J(\alpha[a] \Rightarrow_{inf} \beta[a])$ for $\alpha[a], \beta[a] \in \mathcal{L}$. Since whether
A draws a justified inference or not depends on A’s cognitive history, we can
only ascribe justified inferences relative to a sequence $t - n\Delta t_c, \ldots, t$ of points of
time at which A draws the inference (recall section 4.6), and relative to a “cognitive
history” $\langle s(t) \rangle_{t \in I_n}$ of A, s.t. $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle$ is the subsequence of
$\langle s(t) \rangle_{t \in I_n}$ from time $t - n\Delta t_c$ to time $t$. So we say:

• (Notation for Justified Monotonic Inference Ascription)

$\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \rightarrow_{inf} \beta[a])$ if and only if A draws
from $t - n\Delta t_c$ to $t$ justifiedly relative to $\langle s(t) \rangle_{t \in I_n}$ a monotonic inference
from the belief that [$\alpha[a]$ is true] to the belief that [$\beta[a]$ is true].

The right side of the equivalence may be paraphrased as:

A draws from $t - n\Delta t_c$ to $t$ a monotonic inference from $\alpha[a]$ to $\beta[a]$ that is
justified relative to $\langle s(t) \rangle_{t \in I_n}$;

or: it is justified relative to $\langle s(t) \rangle_{t \in I_n}$ that A draws from $t - n\Delta t_c$ to $t$
a monotonic inference from $\alpha[a]$ to $\beta[a]$.

• (Notation for Justified Nonmonotonic Inference Ascription)

$\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{inf} \beta[a])$ if and only if A draws
from $t - n\Delta t_c$ to $t$ justifiedly relative to $\langle s(t) \rangle_{t \in I_n}$ a nonmonotonic inference
from the belief that [$\alpha[a]$ is true] to the belief that [$\beta[a]$ is true].

This can be read equivalently as:

A draws from $t - n\Delta t_c$ to $t$ a nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ that is
justified relative to $\langle s(t) \rangle_{t \in I_n}$;

or: it is justified relative to $\langle s(t) \rangle_{t \in I_n}$ that A draws from $t - n\Delta t_c$ to $t$
a nonmonotonic inference from $\alpha[a]$ to $\beta[a]$.

So, filling in the formal details of our informal explication of the notion 
of justified inference in chapter 6, we have: 

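Definition 54 (Justified Inference Ascription)

1. $\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \rightarrow_{inf} \beta[a])$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$

(b) (Primary Justification I)

it is absolutely reliable to draw an inference from $\alpha[a]$ to $\beta[a]$

(c) (Basic Inference Case)

if A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic monotonic
inference from $\alpha[a]$ to $\beta[a]$, then:

there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

i. $\langle s(t - n\Delta t_c), \ldots, s(t - (n-1)\Delta t_c) \rangle \models \gamma_0[a] \rightarrow_{inf^d} \gamma_1[a]$

ii. $\langle s(t - (n-1)\Delta t_c), \ldots, s(t - (n-2)\Delta t_c) \rangle \models \gamma_1[a] \rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

iii. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \gamma_{n-1}[a] \rightarrow_{inf^d} \gamma_n[a]$

and for all $k$ with $0 \leq k \leq n - 1$:

(Primary Justification II)

it is absolutely reliable to draw an inference from $\gamma_k[a]$ to $\gamma_{k+1}[a]$

(d) (Acquired Inference Case)

if A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired monotonic
inference from $\alpha[a]$ to $\beta[a]$, then:

there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

i. $\langle s(t - n\Delta t_c), \ldots, s(t - (n-1)\Delta t_c) \rangle \models \gamma_0[a] \rightarrow_{inf^d} \gamma_1[a]$

ii. $\langle s(t - (n-1)\Delta t_c), \ldots, s(t - (n-2)\Delta t_c) \rangle \models \gamma_1[a] \rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

iii. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \gamma_{n-1}[a] \rightarrow_{inf^d} \gamma_n[a]$

and for all $k$ with $0 \leq k \leq n - 1$:

(Primary Justification II)

it is absolutely reliable to draw an inference from $\gamma_k[a]$ to $\gamma_{k+1}[a]$,
and

if A draws from $t - (n-k)\Delta t_c$ to $t - (n-k-1)\Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$
the acquired direct monotonic inference from $\gamma_k[a]$ to $\gamma_{k+1}[a]$, then:

(Secondary Justification)

there is an amount $\Delta t = \Delta t_0 + \Delta t_1$ of time,

there are $\alpha_0[a], \beta_0[a], \ldots, \alpha_m[a], \beta_m[a] \in \mathcal{L}$, s.t.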

{s{t — {n — k)Atc — At), . . . , s(t — (n — k)Atc)) N 

ao[a] A /?o[a], . . . , a^[a] A / 3 ^[a] -^ind 7 /cN '^here 

for every i G { 0 , . . . , m}, 

{s{t — (n — k)Atc — Ato — Ati), . . . , s(t — (n — k)Atc — Ati)) N 
ai[a] A l3i[a] ai[x] (3i[x], 
and 

{s{t — (n — k)Atc — Ati), . . . , s{t — {n — k)Atc)) N 

^ ^ /^m[^] '^mon ^ 7fc+l[^]? 

S.t.: 
• for every $i \in \{0, \ldots, m\}$:

A justifiedly acquires in
$\langle s(t - (n-k)\Delta t_c - \Delta t_0 - \Delta t_1), \ldots, s(t - (n-k)\Delta t_c - \Delta t_1) \rangle$
by a learning process the strict belief that [$\alpha_i[x] \rightarrow \beta_i[x]$ is true]
from having in at least one of the parameter-settings from
$s(t - (n-k)\Delta t_c - \Delta t_0 - \Delta t_1)$ to $s(t - (n-k)\Delta t_c - \Delta t_1)$ either (i) the singular
belief that [$\alpha_i[a] \wedge \beta_i[a]$ is true] or (ii) the singular belief that
[$\neg\alpha_i[a]$ is true], while having in no such parameter-setting the
singular belief that [$\alpha_i[a] \wedge \neg\beta_i[a]$ is true]

• it is absolutely reliable to reason from $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_m[x] \rightarrow \beta_m[x]$
to $\gamma_k[x] \rightarrow \gamma_{k+1}[x]$

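2. $\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{inf} \beta[a])$ iff

(a) $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{inf} \beta[a]$

(b) (Primary Justification I)

it is highly reliable (either in the probabilistic sense or in the normic
sense) to draw an inference from $\alpha[a]$ to $\beta[a]$

(c) (Basic Inference Case)

if A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic nonmonotonic
inference from $\alpha[a]$ to $\beta[a]$, then:

there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

i. $\langle s(t - n\Delta t_c), \ldots, s(t - (n-1)\Delta t_c) \rangle \models \gamma_0[a] \Rightarrow_{inf^d} \gamma_1[a]$

ii. $\langle s(t - (n-1)\Delta t_c), \ldots, s(t - (n-2)\Delta t_c) \rangle \models \gamma_1[a] \Rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

iii. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \gamma_{n-1}[a] \Rightarrow_{inf^d} \gamma_n[a]$

and for all $k$ with $0 \leq k \leq n - 1$:

(Primary Justification II)

it is highly reliable (either in the probabilistic sense or in the normic
sense) to draw an inference from $\gamma_k[a]$ to $\gamma_{k+1}[a]$

(d) (Acquired Inference Case)

if A draws from $t - n\Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the acquired nonmonotonic
inference from $\alpha[a]$ to $\beta[a]$, then:

there are $\alpha[a] = \gamma_0[a], \gamma_1[a], \ldots, \gamma_n[a] = \beta[a] \in \mathcal{L}$, s.t.

i. $\langle s(t - n\Delta t_c), \ldots, s(t - (n-1)\Delta t_c) \rangle \models \gamma_0[a] \Rightarrow_{inf^d} \gamma_1[a]$

ii. $\langle s(t - (n-1)\Delta t_c), \ldots, s(t - (n-2)\Delta t_c) \rangle \models \gamma_1[a] \Rightarrow_{inf^d} \gamma_2[a]$

$\vdots$

iii. $\langle s(t - \Delta t_c), \ldots, s(t) \rangle \models \gamma_{n-1}[a] \Rightarrow_{inf^d} \gamma_n[a]$

and for all $k$ with $0 \leq k \leq n - 1$:

(Primary Justification II)

it is highly reliable (either in the probabilistic sense or in the normic
sense) to draw an inference from $\gamma_k[a]$ to $\gamma_{k+1}[a]$, and
if A draws from $t - (n-k)\Delta t_c$ to $t - (n-k-1)\Delta t_c$ rel. to $\langle s(t) \rangle_{t \in I_n}$
the acquired direct nonmonotonic inference from $\gamma_k[a]$ to $\gamma_{k+1}[a]$,
then:

(Secondary Justification)

there is an amount $\Delta t = \Delta t_0 + \Delta t_1$ of time,

there are $\alpha_0[a], \beta_0[a], \ldots, \alpha_m[a], \beta_m[a] \in \mathcal{L}$, s.t.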

{s{t — {n — k)Atc — At), . . . , s{t — {n — k)Atc)) N 

ao[a] A /9o[a], . . . , am[a] A f3m[o] ^ind 7k[x] => lk+i[x], where 

for every z G {0, . . . , m}, 

{s{t — {n — k)Atc — Ato — ^^1)? . . . , — (n — k)Atc — Ati)) N 

ai[a] A (3i[a\ ai[x] Pi[x], 
and 

{s{t — {n — k)Atc — Ati), . , . , s{t — {n — k)Atc)) N 

^o[^] /^o[^]? • • • 5 ^m[^] '^mon ^ 

S.t.: 

• for every $i \in \{0, \ldots, m\}$:

A justifiedly acquires in
$\langle s(t - (n-k)\Delta t_c - \Delta t_0 - \Delta t_1), \ldots, s(t - (n-k)\Delta t_c - \Delta t_1) \rangle$
by a learning process (case 1) the defeasible belief that [$\alpha_i[x] \Rightarrow \beta_i[x]$
is true] from having in most of the parameter-settings from
$s(t - (n-k)\Delta t_c - \Delta t_0 - \Delta t_1)$ to $s(t - (n-k)\Delta t_c - \Delta t_1)$ either (i) the
singular belief that [$\alpha_i[a] \wedge \beta_i[a]$ is true] or (ii) the singular belief
that [$\neg\alpha_i[a]$ is true], while having only very rarely, if at all,
the singular belief that [$\alpha_i[a] \wedge \neg\beta_i[a]$ is true], or (case 2) the
strict belief that [$\alpha_i[x] \rightarrow \beta_i[x]$ is true] given the same conditions
as stated above for the justified acquiring of $\alpha_i[x] \rightarrow \beta_i[x]$
for monotonic inferences

• it is absolutely reliable to reason from $\alpha_0[x] \rightarrow/\Rightarrow \beta_0[x], \ldots, \alpha_m[x] \rightarrow/\Rightarrow \beta_m[x]$
to $\gamma_k[x] \Rightarrow \gamma_{k+1}[x]$.
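The evidential conditions in the secondary-justification clauses can be sketched as two simple checks over a learning episode. This is an illustration only: the labels for singular beliefs are hypothetical, and the numerical readings of "most" (more than one half) and "only very rarely" (at most 5 percent) are assumptions, since the definition leaves both unquantified. The sketch also says nothing about what makes the learning process itself justified.

```python
from typing import Optional, Sequence

# Hypothetical labels for the relevant singular belief held in one
# parameter-setting; None means that no relevant singular belief is held.
SUPPORT = {"alpha_and_beta", "not_alpha"}
COUNTER = "alpha_and_not_beta"


def may_acquire_strict(settings: Sequence[Optional[str]]) -> bool:
    """At least one supporting setting and no counterinstance at all."""
    supported = any(s in SUPPORT for s in settings)
    refuted = any(s == COUNTER for s in settings)
    return supported and not refuted


def may_acquire_defeasible(settings: Sequence[Optional[str]],
                           most: float = 0.5, rare: float = 0.05) -> bool:
    """Support in most settings, counterinstances only very rarely (assumed thresholds)."""
    if not settings:
        return False
    n = len(settings)
    support_share = sum(s in SUPPORT for s in settings) / n
    counter_share = sum(s == COUNTER for s in settings) / n
    return support_share > most and counter_share <= rare


clean_run = ["not_alpha", None, "alpha_and_beta"]
noisy_run = ["alpha_and_beta"] * 30 + ["alpha_and_not_beta"]

print(may_acquire_strict(clean_run), may_acquire_defeasible(clean_run))
print(may_acquire_strict(noisy_run), may_acquire_defeasible(noisy_run))
```

A single counterinstance blocks the strict belief but, on these thresholds, still permits the defeasible one.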

Analogous definitions can be given for justified deductive/high probability/normic
inference ascription (by means of expressions like $J(\alpha[a] \rightarrow_{dinf} \beta[a])$,
$J(\alpha[a] \Rightarrow_{hpinf} \beta[a])$, and $J(\alpha[a] \Rightarrow_{ninf} \beta[a])$). In def.54 we have defined
justifiedness of inferences by demanding both primary and secondary justification
for the justified inference itself and, if the inference is indirect, for each
component inference for at least one way of regarding the inference as a
composition of direct inferences; secondary justification is relevant just in the case
of acquired inferences, of course. But other choices would also have been possible,
like, e.g., demanding only primary justification also for the justification
of acquired inferences, or demanding only secondary justification for the
justifiedness of acquired inferences. E.g., Schmitt [142], p.376, writes: “I have ...
criticized Goldman’s reliabilism for methods on the ground that the requirement
of a metareliable [i.e., second-order reliable] . . . process selecting a method is
too strong. I observed that intuitively a person who acquired the common
algorithm for arithmetical sums could form justified beliefs by using it even if
he or she acquired it by an accident like bumping his or her head and thus
even if its use did not result from a metareliable selection process. It should
also be said that Goldman’s requirement that the method be [first-order] reliable
appears to be too strong: an unreliable method for multiplication could be
justifying if selected by a metareliable selection process ... a reliable method
can be sufficient for justification, and a combination of a metareliable selection
process and an unreliable method can together make a single reliable process
sufficient for justification.” Obviously, there is a clash of intuitions here which
is again related to the generality problem. We cannot see how such differences
concerning intuitions could be resolved by a recourse to our pre-theoretical notion
of justification, and the best one can do seems to be to remain open to all possible
directions. Def.54 is but one in a family of admissible explications of the notion
of justified inference.

Any further explication of the notion of justification for learning processes
has to be left open as explained in section 6.3.

8.5 Two Consequences of the Theory 

By def.54 and def.50, justification for both basic and direct deductive/high
probability/normic inferences may be characterized quite plainly in terms of
the truth of the general beliefs that they are based on:

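Corollary 55 (Justification and Truth)

1. If A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct deductive
inference from $\alpha[a]$ to $\beta[a]$, then:

$\langle t - \Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \rightarrow_{dinf^d} \beta[a])$ iff
$\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$

2. if A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct high
probability inference from $\alpha[a]$ to $\beta[a]$, then:

$\langle t - \Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{hpinf^d} \beta[a])$ iff
$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$

3. if A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct normic
inference from $\alpha[a]$ to $\beta[a]$, then:

$\langle t - \Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{ninf^d} \beta[a])$ iff
$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$.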

In the special cases referred to in corollary 55, the justification of a
certain kind of inference from a singular belief to another singular belief is thus
reducible to the truth of the general sentence that expresses the content of the
general belief on which the inference is based. Note that this is not (necessarily)
the case for acquired inferences, for indirect inferences, and also not for those
inferences which are not based on universal, high probability, or normic beliefs.
But for basic inferences we do not have to check the ‘acquired’ clause in def.54;
for direct inferences we need not look for ways in which an inference might be
composed of subinferences which have to be justified themselves.

More generally, we have: 

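Corollary 56 (Justification and Truth II)

1. If A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct monotonic
inference from $\alpha[a]$ to $\beta[a]$, then:

$\langle t - \Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \rightarrow_{inf^d} \beta[a])$ iff
$\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$

2. if A draws from $t - \Delta t_c$ to $t$ rel. to $\langle s(t) \rangle_{t \in I_n}$ the basic direct nonmonotonic
inference from $\alpha[a]$ to $\beta[a]$, then:

$\langle t - \Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{inf^d} \beta[a])$ iff
$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$ or $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$.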

The inferences referred to in the latter corollary are not necessarily 
based on general beliefs the contents of which may be expressed by either 
universal, or high probability, or normic conditionals. 

As a further consequence of our theory, the nonmonotonicity of high
reliability leads to what might be called, with Rescher[133], the “optimum
instability” of justified nonmonotonic inference with respect to premise beliefs, i.e.,
what is “optimally” inferred given a certain premise belief is not necessarily also
“optimal” given a logically strengthened premise belief. Nonmonotonic inferences
are thus also nonmonotonic in an epistemological sense, since inferences
may cease to be justified under addition of premises:

Corollary 57 (“Optimum Instability” of Justified Nonmonotonic Inference)
The justifiedness of nonmonotonic inferences is nonmonotonic, i.e., it
is not necessarily monotonic: there may be $\alpha[a], \beta[a], \gamma[a] \in \mathcal{L}$, $t, t' \in I_n$, s.t.:
$\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{inf} \beta[a])$, but
$\langle t' - n\Delta t_c, \ldots, t' \rangle, \langle s(t) \rangle_{t \in I_n} \not\models J(\alpha[a] \wedge \gamma[a] \Rightarrow_{inf} \beta[a])$.
This is implied by def.54 together with corollary 51. E.g., recall what
we have outlined in the cat&bird example on p. 20 - we quote: “A [the cat] is 
justified to infer that the bird can fly, since A believes that there is a bird in 
front of her, and - by stipulation - A does not have any other relevant belief 
(e.g., a belief undermining this inference, or contradicting the conclusion); thus, 
lacking any relevant counterevidence, it is plausible to assume for A that the 
bird can fly, and if A draws this inference, the inference will be justified. But 
if A also had the perceptual belief that the bird in front of her was actually 
a penguin, then A would not infer justifiedly that the bird can fly.” This is 
an instance of the optimum instability referred to in corollary 57; the intuition 
expressed in the cat&bird example can now easily be backed up by theoretical 
means which are part of our theory of justified inference. 

Rescher[133], p.26, actually speaks of the “optimum-instability” of rationality
with respect to circumstances c, where circumstances are considered
to be epistemic. In the typical case, c will be a set of beliefs of an agent. According
to Rescher, rationality demands optimization, but optimization is always
relative to circumstances. An alternative x may be optimal as far as the agent’s
attainment of a goal g is concerned given the set c of beliefs, although x may
not be optimal as far as the agent’s attainment of g is concerned given
a set c' of beliefs, where c' is a proper superset of c. Put differently: the
rationality of alternatives may change in the light of new evidence.‡ This is
something that rationality and low-level justification have in common.

In philosophy of science, this nonmonotonicity phenomenon is well-known
both in the context of inductive reasoning and in the context of statistical
explanations. It has led Carnap to introduce his ‘requirement of total
evidence’: a particular inductive argument should only be applied by an
agent if its premises comprise the agent’s total knowledge. This requirement
is a methodological maxim that should govern every rational application of
inductive logic. In the context of statistical explanations, Hempel has replaced
Carnap’s rule by the related though more complex rule of maximal specificity.§

‡ Rescher’s discussion of the optimum instability of rationality has been related to
Nonmonotonic Reasoning by Leitgeb[89].

§ For a discussion of both rules see, e.g., Stegmüller[169], chapter IX, sections 7 and 8. See
Schurz[146], pp.8-9, and Schurz[147], pp.538-539, for a discussion of how Carnap’s principle
of total evidence is related to valid nonmonotonic probabilistic reasoning.

8.6 Ideal Agents

Finally, we can define ideal (or perfect) agents as those agents which never
draw unjustified inferences. ‘Ideality’, or ‘perfection’, is thus only meant with
an epistemological connotation here and only as regarding the inferential
activities of an agent, i.e.: ‘ideal’ is meant to be synonymous with ‘epistemologically
inferentially ideal’. Just as justification has been defined relative to the cognitive
history of the agent, this special kind of epistemological perfection is also
defined relative to a cognitive history $\langle s(t) \rangle_{t \in I_n}$ of A:

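Definition 58 (Ideal Agents I)

A is ideal with respect to $\langle s(t) \rangle_{t \in I_n}$ iff

for all $\alpha[a], \beta[a] \in \mathcal{L}$, for all points $t - n\Delta t_c, \ldots, t$ of time in $I_n$:

1. $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \rightarrow_{inf} \beta[a]$ iff
$\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \rightarrow_{inf} \beta[a])$

2. $\langle s(t - n\Delta t_c), \ldots, s(t) \rangle \models \alpha[a] \Rightarrow_{inf} \beta[a]$ iff
$\langle t - n\Delta t_c, \ldots, t \rangle, \langle s(t) \rangle_{t \in I_n} \models J(\alpha[a] \Rightarrow_{inf} \beta[a])$.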

Ideality is by itself only a weak notion of epistemic appraisal, since,
e.g., an agent A that does not draw any inference at all within $\langle s(t) \rangle_{t \in I_n}$ is
trivially ideal with respect to its cognitive history. But ideality in combination
with what Goldman calls ‘power’ (cf. section 22.2 in the appendix), i.e., with
high inferential activity, is an epistemological paragon. It is the counterpart to
Carnap’s high-level “idealized person” referred to in Carnap[30]¶ within the
context of our low-level theory of justified inference.

If A is ideal with respect to every (possible) cognitive history $\langle s(t) \rangle_{t \in I_n}$,
i.e., every sequence $\langle s(t) \rangle_{t \in I_n}$ that conforms to the system laws of A, then we
call A ideal simpliciter:

Definition 59 (Ideal Agents II)

A is ideal iff

for all trajectories $\langle s(t) \rangle_{t \in I_n}$ of A: A is ideal with respect to $\langle s(t) \rangle_{t \in I_n}$.

E.g., let A have in all possible parameter-settings the same general, say,
universal, high probability or normic beliefs, s.t. the latter beliefs are basic with
respect to every trajectory. If each of these beliefs is true, then, by corollary
55, A only draws justified inferences on the basis of such general beliefs. Thus
A is ideal. We are going to study ideal agents of such a type in part IV.

¶ Carnap[30], p.307: “Rational credence is to be understood as the credence function of
a completely rational person X; this is, of course, not any real person, but an imaginary,
idealized person. We carry out the idealization step for step, by introducing requirements of
rationality . . .”
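The definition of ideality relative to a cognitive history can be sketched in code. Since a justified inference is by definition one that is actually drawn, the biconditionals of definition 58 amount to the requirement that every inference drawn in the history is justified. The encoding of inferences as triples and the `justified` predicate below are hypothetical stand-ins; the sketch does not implement def.54 itself.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical encoding: (time index, premise, conclusion).
Inference = Tuple[int, str, str]


def is_ideal_wrt_history(drawn: Iterable[Inference],
                         justified: Callable[[Inference], bool]) -> bool:
    """True iff no inference drawn in the history is unjustified."""
    return all(justified(inf) for inf in drawn)


# Stand-in for the justification test: only bird-to-flies inferences pass.
justified = lambda inf: inf[1:] == ("bird", "flies")

careful = [(3, "bird", "flies")]
careless = [(3, "bird", "flies"), (5, "penguin", "flies")]
idle: list = []

print(is_ideal_wrt_history(careful, justified))
print(is_ideal_wrt_history(careless, justified))
print(is_ideal_wrt_history(idle, justified))   # trivially ideal: nothing is inferred
```

The last case shows why ideality alone is a weak standard: an agent that never infers anything satisfies it vacuously.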
Chapter 9
THE SEMANTICS OF DEDUCTIVE AND NONMONOTONIC LOGIC



In part I we have explicated the notions of a monotonic or nonmonotonic infer- 
ence; in part II we have stated a theory of justification for both of these types 
of inference. In this part we will deal with the logic that is underlying justified 
monotonic and nonmonotonic inference. This logic is deductive logic in the first 
case, and nonmonotonic logic in the second one. 

Logic enters the stage, so to speak, “by the side entrance”. Within our
theory of justified inference we have of course presupposed classical logic both
on the level of the theory and on the metalevel on which we have defended
our theory. But what is much more relevant: as we are told by this theory, the
justifiedness of an inference is related to the reliability of the inference, and
also to the reliability of the processes that it is based on. Reliability has in
turn been defined by recourse to phrases like ‘$\mathfrak{M}_{act} \models \alpha[x] \rightarrow \beta[x]$’,
‘$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{hp} \beta[x]$’, or ‘$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$’, but we have not yet explained
the model that we have referred to by ‘$\mathfrak{M}_{act}$’, which satisfies universal, high
probability, and normic conditionals. What does this model, or models like it,
look like, and what does it mean to say that such a model satisfies the latter
conditionals? Put in a nutshell: apart from some intuitive suggestions, we have
not yet given any semantics for the conditionals that we need in order to
define the qualitative notions of reliability that are presupposed by our theory
of justified inference, and it is only in this sense that we are now going to deal
with the logic “of” justified monotonic and nonmonotonic inference.

The question of whether a formal semantics is adequate for a certain
fragment of language is to be settled by matching the semantics to the
pre-theoretical semantical intuitions that govern the average linguistic behaviour
of the community of speakers that are relevant and competent with respect to
this particular language fragment; such intuitions usually have both descriptive
and normative aspects. E.g., in our case, the fragment of language that we
are interested in consists of expressions like $\alpha[x] \rightarrow \beta[x]$, $\alpha[x] \Rightarrow_{hp} \beta[x]$, and
$\alpha[x] \Rightarrow_{nor} \beta[x]$, or, rather, their natural language counterparts of the form
‘all $\alpha$s are $\beta$s’, ‘(by far the) most $\alpha$s are $\beta$s’, and ‘normal $\alpha$s are $\beta$s’. The
logical entailment relation that is to be defined by the formal semantics for
$\alpha[x] \rightarrow \beta[x]$, $\alpha[x] \Rightarrow_{hp} \beta[x]$, and $\alpha[x] \Rightarrow_{nor} \beta[x]$ is thus to be matched with our
informal pre-theoretical entailment relation holding between sets of expressions
of the form ‘all $\alpha$s are $\beta$s’, ‘(by far the) most $\alpha$s are $\beta$s’, or ‘normal $\alpha$s are $\beta$s’.
The set of logically valid formulas is to be compared with the intuitively valid
ones, and so on. This matching is not demanded to be perfect since the
pre-theoretical semantical intuitions of the relevant speech community are usually
vague and incoherent, but at the end we should have good reasons to believe
that the formal semantics is actually telling us what our semantical intuitions
ought to be like.

In sections 9.1, 9.2, and 9.3 we will suggest various versions of a se- 
mantics for universal, high probability, and normic conditionals. At the end of 
those sections we will compare the strength of the different versions in terms 
of their notions of entailment, validity, etc. In chapter 10 we will present the 
logical systems that have been suggested - to some extent independently of an 
intended formal background semantics - as systems of valid rules of inference 
for universal, high probability, and normic conditionals. The soundness and 
completeness results stated in chapter 11 “prove” the intuitive considerations 
that have led to these systems to be correct, or, put the other way round, the 
results of chapter 11 support the thesis that the semantical systems defined in 
sections 9.1, 9.2, and 9.3 are adequate. 

In chapter 12 we will discuss some consequences of the results stated
in chapter 11 for our theory of justified inference.

Before we turn to the formal semantics of the conditionals that we
are interested in, let us first recall some facts from chapter 2: our fixed factual
language $\mathcal{L}$ is essentially a propositional language, since every atomic formula of
$\mathcal{L}$ is of the form $P(a)$ and may be identified with a propositional variable; $\mathcal{L}$ does
not contain quantified formulas but only consists of sentential compositions of
formulas by connectives of propositional logic. In this part we will therefore
view $\mathcal{L}$ mainly as a propositional language, and we will speak of ‘propositional
variables’ etc. The vocabulary of $\mathcal{L}$ has been assumed to contain only finitely
many (and only unary) predicates $P$ and just one individual constant $a$, and
thus - considered as a propositional language - $\mathcal{L}$ is logically finite, i.e., it
contains only finitely many propositional variables. We are again going to use
also ‘$\alpha$’, ‘$\beta$’, ‘$\gamma$’, etc. as metavariables over the formulas of $\mathcal{L}$ (instead of ‘$\alpha[a]$’,
‘$\beta[a]$’, ‘$\gamma[a]$’, etc.).

$D_{act}$ has been defined as the domain of objects in our envisioned epistemological
sandbox territory, whereas the metavariable ‘D’ has been used to
range over arbitrary but non-empty universes of discourse; $D_{act}$ is one of these
universes. For every interpretation function $\mathfrak{I}$ that is given relative to some
D, i.e., where $\mathfrak{I}$ assigns a subset of D to every predicate P in the vocabulary
of $\mathcal{L}$, we have defined satisfaction for the formulas of $\mathcal{L}$, i.e., the conditions
in which $(\mathfrak{I}, d) \vDash \varphi$. The actually intended interpretation mapping $\mathfrak{I}_{act}$ from
chapter 2 has been defined relative to $D_{act}$. We have furthermore assumed that
for every interpretation mapping $\mathfrak{I}$ relative to D and for every object $d \in D$
there is an object-description of d, s.t. d is uniquely logically characterized by
some formula in $\mathcal{L}$ that is satisfied by d and only by d. Therefore, for a given
interpretation $\mathfrak{I}$, we may identify every $d \in D$ with a propositional variable
setting $w = w_{\mathfrak{I}}(d)$ (called a “world”) for the propositional variables of $\mathcal{L}$, s.t.
w satisfies a propositional variable p iff $(\mathfrak{I}, d) \vDash p$. For given $\mathfrak{I}$, we can thus also
simply write that $w \vDash \varphi$ iff $(\mathfrak{I}, d) \vDash \varphi$. D might therefore be identified with
a set $W = W_{\mathfrak{I}}(D)$ of worlds for $\mathcal{L}$. Note that such a set W of worlds is not
necessarily identical to the set of all possible worlds (i.e., propositional variable
settings) for $\mathcal{L}$. Similarly, we can identify every $d \in D_{act}$ with $w = w_{\mathfrak{I}_{act}}(d)$
and $D_{act}$ with $W_{act} = W_{\mathfrak{I}_{act}}(D_{act})$. Every model that we are going to consider
in the following sections is based on a given set W of worlds, and thus on a
given interpretation mapping $\mathfrak{I}$.

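In chapter 2 we have referred to the formulas $\alpha[x]$ as those formulas
that we get by substituting x for a in $\alpha[a] \in \mathcal{L}$. Let $\mathcal{L}_{\rightarrow}$ be again the set of
universal conditionals $\alpha[x] \rightarrow \beta[x]$, $\mathcal{L}_{\Rightarrow_{hp}}$ the set of high probability conditionals
$\alpha[x] \Rightarrow_{hp} \beta[x]$, and $\mathcal{L}_{\Rightarrow_{nor}}$ the set of normic conditionals $\alpha[x] \Rightarrow_{nor} \beta[x]$ (for
some $\alpha[a], \beta[a] \in \mathcal{L}$). Recall that $\mathcal{L}_{\rightarrow}$, $\mathcal{L}_{\Rightarrow_{hp}}$, $\mathcal{L}_{\Rightarrow_{nor}}$ are not propositional
languages but rather fragments of languages with quantifiers, since the connectives
$\rightarrow$, $\Rightarrow_{hp}$, $\Rightarrow_{nor}$ are actually quantifiers binding x. Such x-formulas (Adams[2],
Schurz[147]) have been extended to formulas with more than one variable by
Schurz and Adams in [152], but the corresponding logical systems are not yet
fully understood - e.g., the probability semantics for these extended languages
still lacks a complete proof theory. $\mathcal{L}_{\rightarrow}$, $\mathcal{L}_{\Rightarrow_{hp}}$, $\mathcal{L}_{\Rightarrow_{nor}}$ contain conditionals,
not propositional constructions based on conditionals.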

Let us now presuppose a fixed set W of worlds (W might be identical
to $W_{act}$, but it is not necessarily so). Most notions that are defined in the
subsequent three sections will only be given relative to W, though - for brevity
- not always explicitly stated as such; thus we do not have to introduce indices
referring to the set of worlds relative to which the semantical systems are
developed.

9.1 Semantics for Universal Conditionals 

The semantics for universal conditionals is well-known and standard but for the
following exceptions: (i) a classical semantics is usually introduced for full
first-order predicate logic, whereas we restrict ourselves to formulas of the special
form $\alpha[x] \rightarrow \beta[x]$ that contain the universal quantifier $\forall$ only implicitly; (ii) we
define logical entailment and validity for universal models relative to a fixed set
W of worlds, since this will be convenient in the light of the following sections
on the semantics of high probability and normic conditionals; the so-defined
semantical notions correspond to their usual classical counterparts only if W
is identical to the set of all possible worlds for $\mathcal{L}$:
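Definition 60 (Semantics for Universal Conditionals)

1. A universal model $\mathfrak{M}^u$ for universal conditionals is a subset of the given
set W of worlds

2. relative to a universal model $\mathfrak{M}^u$ we can define:

$\mathfrak{M}^u \vDash^u \alpha[x] \rightarrow \beta[x]$

iff for all $w \in \mathfrak{M}^u$: if $w \vDash \alpha[a]$ then $w \vDash \beta[a]$

(i.e.: all worlds that satisfy $\alpha[a]$ also satisfy $\beta[a]$, or, more briefly: all $\alpha$s
are $\beta$s)

3. let $\mathcal{TH}_{\rightarrow}(\mathfrak{M}^u) = \{\alpha[x] \rightarrow \beta[x] \mid \mathfrak{M}^u \vDash^u \alpha[x] \rightarrow \beta[x]\}$;

$\mathcal{TH}_{\rightarrow}(\mathfrak{M}^u)$ is the universal theory corresponding to $\mathfrak{M}^u$

4. $\alpha[x] \rightarrow \beta[x]$ is universally valid iff

for every universal model $\mathfrak{M}^u$: $\mathfrak{M}^u \vDash^u \alpha[x] \rightarrow \beta[x]$

5. let $KB_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$, let $\mathfrak{M}^u$ be a universal model:

$\mathfrak{M}^u \vDash^u KB_{\rightarrow}$ iff

for every $\alpha[x] \rightarrow \beta[x] \in KB_{\rightarrow}$ it holds that $\mathfrak{M}^u \vDash^u \alpha[x] \rightarrow \beta[x]$

6. let $KB_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$, let $\alpha[x] \rightarrow \beta[x] \in \mathcal{L}_{\rightarrow}$:
we say that

$KB_{\rightarrow} \vDash^u \alpha[x] \rightarrow \beta[x]$

($KB_{\rightarrow}$ universally entails $\alpha[x] \rightarrow \beta[x]$) iff

for every universal model $\mathfrak{M}^u$:

if $\mathfrak{M}^u \vDash^u KB_{\rightarrow}$ then $\mathfrak{M}^u \vDash^u \alpha[x] \rightarrow \beta[x]$.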

Let ‘$\vDash$’ express the usual (classical) logical satisfaction/implication
relation; accordingly, we will use ‘$\vdash$’ to express the (classical) logical derivability
relation. We may regard $\mathfrak{M}^u \vDash^u \alpha[x] \rightarrow \beta[x]$ and $\mathfrak{M}^u \vDash \alpha[x] \rightarrow \beta[x]$ as
equivalent.
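
Definition 60 can be checked mechanically for a small language. The following is a minimal sketch under our own assumptions: worlds are represented as sets of true propositional variables, formulas as Python predicates on worlds, and the variable names (bird, penguin, flies) as well as all function names are hypothetical illustrations that do not occur in the text. Entailment is decided by brute force over all subsets of W, which is only feasible because W is finite and small.

```python
from itertools import product

# Hypothetical propositional variables standing in for atomic formulas P(a).
VARIABLES = ("bird", "penguin", "flies")

def all_worlds(variables=VARIABLES):
    """Every propositional variable setting, as a frozenset of the true variables."""
    return [frozenset(v for v, bit in zip(variables, bits) if bit)
            for bits in product([False, True], repeat=len(variables))]

def satisfies_universal(model, alpha, beta):
    """M |=u alpha[x] -> beta[x]: every alpha-world in the model is a beta-world."""
    return all(beta(w) for w in model if alpha(w))

def universally_entails(W, kb, alpha, beta):
    """KB |=u alpha[x] -> beta[x], quantifying over all subsets of W as models."""
    W = list(W)
    for bits in product([False, True], repeat=len(W)):
        model = [w for w, bit in zip(W, bits) if bit]
        if all(satisfies_universal(model, a, b) for a, b in kb):
            if not satisfies_universal(model, alpha, beta):
                return False
    return True

bird = lambda w: "bird" in w
penguin = lambda w: "penguin" in w
flies = lambda w: "flies" in w

W = all_worlds()
kb = [(penguin, bird), (bird, flies)]          # penguin -> bird, bird -> flies
transitive = universally_entails(W, kb, penguin, flies)
converse = universally_entails(W, kb, flies, bird)
```

In this sketch the knowledge base entails the transitive conclusion but not the converse of its second member, as one would expect of universal conditionals.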



9.2 Semantics for High Probability Conditionals 

We need the following preliminary definitions before we can state the different 
versions of probability semantics for high probability conditionals: 

Definition 61 (Probability Spaces)

1. $(W, \mathfrak{A}, Prob)$ is a probability space iff

(a) $\mathfrak{A}$ is a $\sigma$-algebra on W, i.e.

i. $\mathfrak{A} \subseteq \wp(W)$
ii. $\emptyset, W \in \mathfrak{A}$
iii. if $X \in \mathfrak{A}$ then $W \setminus X \in \mathfrak{A}$
iv. if $X_1, X_2, X_3, \ldots \in \mathfrak{A}$ then $\bigcup_{i \in \mathbb{N}} X_i \in \mathfrak{A}$

(b) Prob is a probability measure on $\mathfrak{A}$, i.e.

i. $Prob: \mathfrak{A} \rightarrow [0, 1]$
ii. $Prob(W) = 1$, $Prob(\emptyset) = 0$
iii. if $X_1, X_2, X_3, \ldots \in \mathfrak{A}$, s.t. $X_i \cap X_j = \emptyset$ for all $i \neq j \in \mathbb{N}$, then

$Prob\left(\bigcup_{i \in \mathbb{N}} X_i\right) = \sum_{i \in \mathbb{N}} Prob(X_i)$

2. since W is - by the assumption on $\mathcal{L}$ above - finite, it is useful to use
$\wp(W)$ as the $\sigma$-algebra of every probability space that we are going to
consider now, and this will be presupposed in the following. Instead of a
$\sigma$-algebra, we can now simply speak of a (Boolean) power set algebra;
instead of $\sigma$-additivity only finite additivity is relevant, i.e.:

for all $X, Y \subseteq W$, s.t. $X \cap Y = \emptyset$: $Prob(X \cup Y) = Prob(X) + Prob(Y)$.

For the given set W, a probability space may thus be simply identified with
its probability measure Prob that we will always regard as being defined
on $\wp(W)$

3. since W is a set of propositional variable assignments for $\mathcal{L}$, we may use
Prob to assign probabilities also to formulas of the form $\alpha[x]$ (but we use
the same function sign ‘Prob’ again) in the following way:

(a) for $\alpha \in \mathcal{L}$, let $[\alpha] = \{w \in W \mid w \vDash \alpha\}$

($[\alpha] \in \mathfrak{A} = \wp(W)$)

(b) let $\alpha \in \mathcal{L}$:

$Prob(\alpha[x]) := Prob([\alpha])$

(c) let $\alpha, \beta \in \mathcal{L}$:

$Prob(\beta[x] \mid \alpha[x]) := Prob([\beta] \mid [\alpha]) = \frac{Prob([\alpha] \cap [\beta])}{Prob([\alpha])}$, if $Prob([\alpha]) \neq 0$

$Prob(\beta[x] \mid \alpha[x]) := 1$, otherwise

($Prob(\beta[x] \mid \alpha[x])$ is the probability of $\beta[x]$ given $\alpha[x]$; we also say
that $Prob(\beta[x] \mid \alpha[x])$ is the conditional probability that is associated with
the high probability conditional $\alpha[x] \Rightarrow_{hp} \beta[x]$)

(d) let $\alpha, \beta \in \mathcal{L}$:

$Unc(\beta[x] \mid \alpha[x]) := 1 - Prob(\beta[x] \mid \alpha[x])$

($Unc(\beta[x] \mid \alpha[x])$ is the uncertainty of $\beta[x]$ given $\alpha[x]$; we also say
that $Unc(\beta[x] \mid \alpha[x])$ is the uncertainty that is associated with the high
probability conditional $\alpha[x] \Rightarrow_{hp} \beta[x]$).
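
Since W is finite, a probability measure on the power set algebra is fixed by the weights of the single worlds, so the definitions of part 3 can be sketched in a few lines. The representation below (a dictionary from worlds to exact fractions, formulas as predicates) and all names in it are our own illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def prob(measure, formula):
    """Prob(alpha[x]) := Prob([alpha]): total weight of the worlds satisfying alpha."""
    return sum((p for w, p in measure.items() if formula(w)), Fraction(0))

def cond_prob(measure, beta, alpha):
    """Prob(beta[x] | alpha[x]); defined to be 1 when Prob([alpha]) = 0."""
    p_alpha = prob(measure, alpha)
    if p_alpha == 0:
        return Fraction(1)
    return prob(measure, lambda w: alpha(w) and beta(w)) / p_alpha

def unc(measure, beta, alpha):
    """Unc(beta[x] | alpha[x]) := 1 - Prob(beta[x] | alpha[x])."""
    return 1 - cond_prob(measure, beta, alpha)

# A hypothetical measure on two worlds; worlds are sets of true variables.
measure = {
    frozenset({"bird", "flies"}): Fraction(9, 10),
    frozenset({"bird"}): Fraction(1, 10),
}
bird = lambda w: "bird" in w
penguin = lambda w: "penguin" in w
flies = lambda w: "flies" in w

p_flies_given_bird = cond_prob(measure, flies, bird)
u_flies_given_bird = unc(measure, flies, bird)
p_flies_given_penguin = cond_prob(measure, flies, penguin)  # antecedent has probability 0
```

Note how the convention for antecedents of probability 0 makes the conditional about penguins trivially certain in this measure, since no world of positive weight contains a penguin.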

We will now present different kinds of semantics for high probability
conditionals, where each semantics - except for the infinitesimal$_{II}$ semantics
- is, essentially, based on some probability semantics that has been suggested
by Adams (see e.g. [1], [2], [3], [6]). Adams’s semantical systems have
been further developed by Pearl (see e.g. [118], [120]), Lehmann&Magidor[88],
Goldszmidt[74], Schurz (see [146], [147], [148], e.g., for the system extending
the system P introduced below), Snow[164], and Bamber[16].

Each of the semantical systems below includes the definition of an 
entailment relation that holds between sets of high probability conditionals 
and further such conditionals. But not every semantics that we are going to 
deal with is based on the notion of truth of a high probability conditional in 
a model. Since we have presupposed such a notion in parts I and II, those 
semantical systems by which truth for high probability conditionals is actually 
defined are of particular relevance for our concerns. However, the other kinds 
of probability semantics are still relevant for us in so far as they throw some 
light on the former types of semantics. Let us call a semantical system with a 
satisfaction relation that holds between a model and a conditional a ‘first-order’ 
semantics. Let us furthermore call a semantical system with an entailment 
relation holding between a set of conditionals and a conditional a ‘second-order’
semantics. The semantical systems which we are going to turn to are
either both first-order and second-order semantical, where entailment is
derivative on satisfaction, or they are purely second-order.

Except for the sequence semantics and the majority semantics (see 
below), each of the subsequent semantics has been defined in the literature 
cited above. The sequence semantics has been hinted at in a footnote on p.277 
of Adams [4], but - to the best of our knowledge - it has nowhere been stated 
explicitly before; the majority semantics does not seem to have been addressed, 
either. 



Following Schurz we call the following second-order semantical system
“infinitesimal”; since we are going to introduce a further “infinitesimal” semantics
later, we add the subscript ‘I’. According to this semantics, a set of high
probability conditionals entails a further high probability conditional if and
only if: the higher the probabilities of the conditionals contained in the set, the
higher also the probability of the conditional to be entailed. This leads to a
kind of “continuity” semantics for high probability conditionals employing an
$\epsilon$-$\delta$-criterion. The quantification over probability measures that is employed by
the following semantics corresponds, roughly, to the quantification over classical
models in the well-known supervaluation semantics of vague predicates:

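Definition 62 (Infinitesimal Semantics for High Probability Conditionals I)

1. Let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$ ($KB_{\Rightarrow_{hp}}$ is a probabilistic conditional knowledge
base), let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{inf_I} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ infinitesimally$_I$ entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff

for all $\epsilon > 0$ there is a $\delta > 0$, s.t. for all probability measures Prob:

if for all $\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}$ it holds that $Prob(\psi[x] \mid \varphi[x]) \geq 1 - \delta$,

then $Prob(\beta[x] \mid \alpha[x]) \geq 1 - \epsilon$

(i.e.: if $Prob(\psi[x] \mid \varphi[x])$ is “high” for all $\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}$, then
also $Prob(\beta[x] \mid \alpha[x])$ is “high”)

2. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is infinitesimally$_I$ valid iff
$\emptyset \vDash^{hp}_{inf_I} \alpha[x] \Rightarrow_{hp} \beta[x]$

3. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the infinitesimal$_I$ sense
iff

for all $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$: if $\mathcal{TH}_{\Rightarrow_{hp}} \vDash^{hp}_{inf_I} \alpha[x] \Rightarrow_{hp} \beta[x]$ then
$\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{TH}_{\Rightarrow_{hp}}$.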

The latter infinitesimal semantics can be restated in terms of a first-order
sequence semantics, where a sequence of probability measures is defined
to satisfy a high probability conditional if the conditional probability associated
with the conditional is identical to 1 “in the limit” of the sequence:

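Definition 63 (Sequence Semantics for High Probability Conditionals)

1. A probabilistic sequence model $\mathfrak{M}^{hp}_{seq}$ for high probability conditionals is a
sequence $(Prob_n)_{n \in \mathbb{N}}$ of probability measures

2. relative to a probabilistic sequence model $\mathfrak{M}^{hp}_{seq} = (Prob_n)_{n \in \mathbb{N}}$ we can
define:

$\mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]$

iff the real sequence $(Prob_n(\beta[x] \mid \alpha[x]))_{n \in \mathbb{N}}$ converges, and

$\lim_{n \rightarrow \infty} Prob_n(\beta[x] \mid \alpha[x]) = 1$

(i.e.: $Prob_n(\beta[x] \mid \alpha[x])$ “tends” to be 1 for increasing n)

3. let $\mathcal{TH}_{\Rightarrow_{hp}}(\mathfrak{M}^{hp}_{seq}) = \{\alpha[x] \Rightarrow_{hp} \beta[x] \mid \mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{hp}}(\mathfrak{M}^{hp}_{seq})$ is the high probability conditional theory corresponding to
$\mathfrak{M}^{hp}_{seq}$

4. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is sequence-valid iff

for every probabilistic sequence model $\mathfrak{M}^{hp}_{seq}$: $\mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]$

5. let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\mathfrak{M}^{hp}_{seq}$ be a probabilistic sequence model:

$\mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} KB_{\Rightarrow_{hp}}$ iff

for every $\alpha[x] \Rightarrow_{hp} \beta[x] \in KB_{\Rightarrow_{hp}}$ it holds that $\mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]$

6. let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ sequence-entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff

for every probabilistic sequence model $\mathfrak{M}^{hp}_{seq}$:

if $\mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} KB_{\Rightarrow_{hp}}$ then $\mathfrak{M}^{hp}_{seq} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]$.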

Now we turn to a noninfinitesimal second-order high probability semantics.
Schurz[146], [147], [148] emphasizes the practical importance of the
following semantics, according to which a set of high probability conditionals
entails a high probability conditional if the uncertainty associated with the
latter is less than or equal to the sum of the uncertainties of the conditionals
contained in the entailing set, i.e.: if the uncertainty of the conditional to be
entailed is bounded by the uncertainties that are associated with the premise
conditionals in a certain way. In contrast to the infinitesimal semantics above, if
a set of high probability conditionals entails another such conditional, there is
a lower bound for the probability that is associated with the conclusion, and
this lower bound can additionally be computed easily*:

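Definition 64 (Noninfinitesimal Semantics for High Probability Conditionals)

1. Let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{ninf} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ noninfinitesimally entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff
for every probability measure Prob:†

$Prob(\beta[x] \mid \alpha[x]) \geq 1 - \sum_{\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}} Unc(\psi[x] \mid \varphi[x])$, i.e.,

$Unc(\beta[x] \mid \alpha[x]) \leq \sum_{\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}} Unc(\psi[x] \mid \varphi[x])$

(i.e.: $Prob(\beta[x] \mid \alpha[x])$ is “high” if the uncertainties $Unc(\psi[x] \mid \varphi[x])$ are
very “low” for all $\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}$)

2. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is noninfinitesimally valid iff
$\emptyset \vDash^{hp}_{ninf} \alpha[x] \Rightarrow_{hp} \beta[x]$

3. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the noninfinitesimal
sense iff

for all $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$: if $\mathcal{TH}_{\Rightarrow_{hp}} \vDash^{hp}_{ninf} \alpha[x] \Rightarrow_{hp} \beta[x]$ then
$\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{TH}_{\Rightarrow_{hp}}$.

* As Schurz[147] points out, the noninfinitesimal (this is again his term) entailment relation
approximates the so-called ‘quasi-tightness’ property of inferences that has been defined in
Frisch&Haddawy[52].

† A sum over an empty set of indices is now and in the following always defined to be 0.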
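
The uncertainty-sum criterion of definition 64 is easy to test against individual probability measures. The sketch below is our own illustration: the atoms a, b, c, the helper names, and the randomly sampled measures are all assumptions. It checks the bound for the pattern “from $a \Rightarrow_{hp} b$ and $a \Rightarrow_{hp} c$ to $a \Rightarrow_{hp} b \wedge c$” on sampled measures, and exhibits one measure that violates the bound for transitivity. Sampling can of course only refute an entailment claim, never establish one.

```python
import random
from fractions import Fraction
from itertools import product

ATOMS = ("a", "b", "c")
WORLDS = [frozenset(x for x, bit in zip(ATOMS, bits) if bit)
          for bits in product([False, True], repeat=len(ATOMS))]

def cond_prob(measure, beta, alpha):
    """Prob(beta | alpha), with the convention that it is 1 if Prob(alpha) = 0."""
    p_alpha = sum((p for w, p in measure.items() if alpha(w)), Fraction(0))
    if p_alpha == 0:
        return Fraction(1)
    p_both = sum((p for w, p in measure.items() if alpha(w) and beta(w)), Fraction(0))
    return p_both / p_alpha

def bound_holds(measure, kb, conclusion):
    """Unc(conclusion) <= sum of the premise uncertainties, for this one measure."""
    alpha, beta = conclusion
    total = sum((1 - cond_prob(measure, psi, phi) for phi, psi in kb), Fraction(0))
    return 1 - cond_prob(measure, beta, alpha) <= total

def random_measure(rng):
    """A measure with random integer weights, normalized to sum to 1."""
    weights = [rng.randint(0, 10) for _ in WORLDS]
    if sum(weights) == 0:
        weights[0] = 1
    total = sum(weights)
    return {w: Fraction(k, total) for w, k in zip(WORLDS, weights)}

a = lambda w: "a" in w
b = lambda w: "b" in w
c = lambda w: "c" in w
b_and_c = lambda w: b(w) and c(w)

rng = random.Random(0)
and_pattern_ok = all(bound_holds(random_measure(rng), [(a, b), (a, c)], (a, b_and_c))
                     for _ in range(1000))

# One measure on which transitivity (a => b, b => c, therefore a => c) breaks the bound.
counter = {frozenset({"a", "b"}): Fraction(1, 100),
           frozenset({"b", "c"}): Fraction(99, 100)}
transitivity_ok = bound_holds(counter, [(a, b), (b, c)], (a, c))
```

On the counterexample measure the two premises have uncertainties 0 and 1/100, while the conclusion has uncertainty 1, so the set of premises does not noninfinitesimally entail the transitive conclusion.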



According to the following first-order semantics, a high probability 
conditional is satisfied by a certain kind of probability measure that ranks 
objects by polynomial “orders of magnitude” ; a high probability conditional is 
satisfied by such a probability measure if its associated conditional probability 
is of the maximal order of magnitude (compare Snow’s [164] “atomic bound 
probabilities” ) : 

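Definition 65 (Order of Magnitude Semantics for High Probability Conditionals)

1. A probabilistic order-of-magnitude model $\mathfrak{M}^{hp}_{om}$ for high probability
conditionals is a bijective mapping $om: W \rightarrow \{0, \ldots, card(W) - 1\}$

($om(w)$ is the “probabilistic order of magnitude” of w)

2. relative to a probabilistic order-of-magnitude model $\mathfrak{M}^{hp}_{om} = om$, and
relative to some small $v \in [0, 1]$ (say, v <\), we can define:

(a) let $Prob_{om}$ be the unique probability measure that satisfies:
$Prob_{om}(\{w\}) = v^{om(w)}(1 - v)$ for $om(w) < card(W) - 2$,
$Prob_{om}(\{w\}) = v^{om(w)}$ for $om(w) = card(W) - 2$,
$Prob_{om}(\{w\}) = 0$ for $om(w) = card(W) - 1$

(b) $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$

iff $Prob_{om}(\beta[x] \mid \alpha[x]) \geq 1 - v$
(i.e.: $Prob_{om}(\beta[x] \mid \alpha[x])$ is “high”).‡

3. let X be a set of probabilistic order-of-magnitude models:

let $\mathcal{TH}_{\Rightarrow_{hp}}(X) = \{\alpha[x] \Rightarrow_{hp} \beta[x] \mid$ for all $\mathfrak{M}^{hp}_{om} \in X$: $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]\}$;§

$\mathcal{TH}_{\Rightarrow_{hp}}(X)$ is the high probability conditional theory corresponding to X

4. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is order-of-magnitude-valid iff

for every probabilistic order-of-magnitude model $\mathfrak{M}^{hp}_{om}$: $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$

5. let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\mathfrak{M}^{hp}_{om}$ be a probabilistic order-of-magnitude model:

$\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} KB_{\Rightarrow_{hp}}$ iff

for every $\alpha[x] \Rightarrow_{hp} \beta[x] \in KB_{\Rightarrow_{hp}}$ it holds that $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$

6. let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ order-of-magnitude-entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff
for every probabilistic order-of-magnitude model $\mathfrak{M}^{hp}_{om}$:

if $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} KB_{\Rightarrow_{hp}}$ then $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$.

‡ It is easy to see that whether $\mathfrak{M}^{hp}_{om} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$ or not, is actually independent
of the selection of v.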
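
To make the order-of-magnitude construction concrete, here is a minimal sketch. It rests on an assumption that should be stated loudly: the weights $v^{om(w)}(1 - v)$, $v^{om(w)}$, and 0 are our reading of clause 2(a), chosen because they are the geometric weights that sum to 1 under the three stated cases; the source text is damaged at this point. The ranking of worlds, the variable names, and the function names are likewise hypothetical, and the sketch assumes that W has at least two worlds.

```python
from fractions import Fraction

def om_measure(ranked_worlds, v):
    """Weights for worlds listed by order of magnitude: ranked_worlds[k] has om(w) = k.

    Assumed reading of clause 2(a); requires at least two worlds.
    """
    n = len(ranked_worlds)
    measure = {}
    for k, w in enumerate(ranked_worlds):
        if k < n - 2:
            measure[w] = v**k * (1 - v)
        elif k == n - 2:
            measure[w] = v**k
        else:
            measure[w] = Fraction(0)
    return measure

def cond_prob(measure, beta, alpha):
    """Prob(beta | alpha), defined to be 1 if Prob(alpha) = 0."""
    p_alpha = sum((p for w, p in measure.items() if alpha(w)), Fraction(0))
    if p_alpha == 0:
        return Fraction(1)
    p_both = sum((p for w, p in measure.items() if alpha(w) and beta(w)), Fraction(0))
    return p_both / p_alpha

def satisfies_om(ranked_worlds, alpha, beta, v):
    """The model satisfies alpha => beta iff Prob_om(beta | alpha) >= 1 - v."""
    return cond_prob(om_measure(ranked_worlds, v), beta, alpha) >= 1 - v

# Hypothetical ranking, from the highest order of magnitude (rank 0) downwards.
ranked = [frozenset({"bird", "flies"}),
          frozenset({"bird"}),
          frozenset({"bird", "penguin"}),
          frozenset({"bird", "penguin", "flies"})]
bird = lambda w: "bird" in w
penguin = lambda w: "penguin" in w
flies = lambda w: "flies" in w

results = {v: (sum(om_measure(ranked, v).values()),
               satisfies_om(ranked, bird, flies, v),
               satisfies_om(ranked, penguin, flies, v))
           for v in (Fraction(1, 10), Fraction(1, 3), Fraction(1, 100))}
```

For each of the three sampled values of v the weights sum to 1, the conditional about birds is satisfied, and the conditional about penguins is not, which agrees with the remark that satisfaction does not depend on the selection of v.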

The next second-order semantics defines a set of high probability con- 
ditionals to entail another high probability conditional, if the probability associ- 
ated with the conclusion conditional cannot be 0 if the probabilities associated 
with the premise conditionals are sufficiently high. If a set of high probability 
conditionals entails a high probability conditional in this sense, the risks of 
reasoning in correspondence to the entailment relationship are limited in the 
following sense: although the probability that is associated with the conclusion 
conditional may be “low” , it is at least guaranteed to be positive as long as the 
probabilities associated with the premise conditionals are “high” : 

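Definition 66 (Positivity Semantics for High Probability Conditionals)

1. Let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{pos} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ positively entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff

there is an $\epsilon > 0$, s.t. for all probability measures Prob:

if for all $\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}$: $Prob(\psi[x] \mid \varphi[x]) \geq 1 - \epsilon$, then
$Prob(\beta[x] \mid \alpha[x]) > 0$

(i.e.: if $Prob(\psi[x] \mid \varphi[x])$ is “high” for all $\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}$, then
it follows that $Prob(\beta[x] \mid \alpha[x])$ is positive)

2. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is positively valid iff
$\emptyset \vDash^{hp}_{pos} \alpha[x] \Rightarrow_{hp} \beta[x]$

3. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the positive sense iff
for all $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

if $\mathcal{TH}_{\Rightarrow_{hp}} \vDash^{hp}_{pos} \alpha[x] \Rightarrow_{hp} \beta[x]$ then $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{TH}_{\Rightarrow_{hp}}$.

§ We associate theories with sets of order-of-magnitude models and not with single
order-of-magnitude models in this case for the sake of the completeness theorem that is stated in
chapter 10; compare the footnote on p.199.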

The subsequent second-order semantics defines a set of high probability
conditionals to entail a further high probability conditional, if, whenever the
probabilities that are associated with the premise conditionals are “near” to
1 (where ‘near’ is only defined relative to the number of given premises), the
probability associated with the conclusion conditional is larger than $\frac{1}{2}$; so if
$\alpha[x] \Rightarrow_{hp} \beta[x]$ is this conclusion, then most $\alpha$s are $\beta$s. ‘Most’ is not synonymous
here with ‘by far the most’ but simply refers to the majority of cases:

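Definition 67 (Majority Semantics for High Probability Conditionals)

1. Let $KB_{\Rightarrow_{hp}} = \{\varphi_1[x] \Rightarrow_{hp} \psi_1[x], \ldots, \varphi_n[x] \Rightarrow_{hp} \psi_n[x]\} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$,
let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{maj} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ majority-entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff
for all probability measures Prob:

if $Prob(\psi_1[x] \mid \varphi_1[x]) > 1 - \frac{1}{2n}, \ldots, Prob(\psi_n[x] \mid \varphi_n[x]) > 1 - \frac{1}{2n}$,

then $Prob(\beta[x] \mid \alpha[x]) > \frac{1}{2}$

(i.e., if the former probabilities are “high”, then most $\alpha$s are $\beta$s)

2. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is majority-valid iff
$\emptyset \vDash^{hp}_{maj} \alpha[x] \Rightarrow_{hp} \beta[x]$

3. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the majority sense iff
there is a finite $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, s.t.:

for all $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{maj} \alpha[x] \Rightarrow_{hp} \beta[x]$ iff $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{TH}_{\Rightarrow_{hp}}$.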
The next kind of first-order semantics has been suggested by
Lehmann&Magidor[88], pp. 48-53, and presupposes real nonstandard analysis.
Nonstandard analysis adds infinitely small numbers (the so-called ‘infinitesimals’)
and infinitely large numbers to the set of real numbers. Apart from the
introduction to nonstandard analysis that is contained in Lehmann&Magidor[88],
an account of it may also be found in Chang&Keisler[32], chapter 4.4, and in
informal terms in Adams[6], pp. 253-256. The idea of an infinitesimal semantics
for high probability conditionals where nonstandard numbers are involved has
already been stated by McCarthy[105], p.92, as a justification of his
circumscription account of nonmonotonic reasoning: “Since circumscription doesn’t
provide numerical probabilities, its probabilistic interpretation involves
probabilities that are . . . infinitesimal, within an infinitesimal of one”:

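Definition 68 (Infinitesimal Semantics for High Probability Conditionals II)

1. An infinitesimal$_{II}$ probabilistic model $\mathfrak{M}^{hp}_{inf_{II}}$ for high probability
conditionals is a nonstandard probability measure $Prob: \wp(W) \rightarrow [0, 1]^*$, i.e.,
probabilities are nonstandard reals between 0 and 1 or identical to 0 or 1,
s.t. $Prob(W) = 1$, $Prob(\emptyset) = 0$, and finite additivity is satisfied

2. relative to an infinitesimal$_{II}$ probabilistic model $\mathfrak{M}^{hp}_{inf_{II}} = Prob$ we can
define:

$\mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]$

iff $1 - Prob(\beta[x] \mid \alpha[x])$ is infinitesimal, i.e.,

for all standard reals $\epsilon \in \mathbb{R}$ with $\epsilon > 0$: $1 - Prob(\beta[x] \mid \alpha[x]) < \epsilon$

(i.e.: $Prob(\beta[x] \mid \alpha[x])$ is either identical to 1 or “infinitely close” to 1)

3. let $\mathcal{TH}_{\Rightarrow_{hp}}(\mathfrak{M}^{hp}_{inf_{II}}) = \{\alpha[x] \Rightarrow_{hp} \beta[x] \mid \mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{hp}}(\mathfrak{M}^{hp}_{inf_{II}})$ is the high probability conditional theory corresponding
to $\mathfrak{M}^{hp}_{inf_{II}}$

4. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is infinitesimally$_{II}$ valid iff

for every infinitesimal$_{II}$ probabilistic model $\mathfrak{M}^{hp}_{inf_{II}}$: $\mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]$

5. let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\mathfrak{M}^{hp}_{inf_{II}}$ be an infinitesimal$_{II}$ probabilistic model:

$\mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} KB_{\Rightarrow_{hp}}$ iff

for every $\alpha[x] \Rightarrow_{hp} \beta[x] \in KB_{\Rightarrow_{hp}}$ it holds that $\mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]$

6. let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$:

we say that

$KB_{\Rightarrow_{hp}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]$

($KB_{\Rightarrow_{hp}}$ infinitesimally$_{II}$ entails $\alpha[x] \Rightarrow_{hp} \beta[x]$) iff
for every infinitesimal$_{II}$ probabilistic model $\mathfrak{M}^{hp}_{inf_{II}}$:

if $\mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} KB_{\Rightarrow_{hp}}$ then $\mathfrak{M}^{hp}_{inf_{II}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]$.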



Let us now compare the different versions of a high probability seman- 
tics concerning their strength. Surprisingly, the semantical systems which we 
have presented in this section turn out to be mutually equivalent in the sense 
specified by the following theorems: 

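Theorem 69 (Equivalence of the Different Versions of Probability Semantics
with respect to Entailment)

Let $KB_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$, $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$;

the following claims are equivalent:

1. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{inf_I} \alpha[x] \Rightarrow_{hp} \beta[x]$

2. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{seq} \alpha[x] \Rightarrow_{hp} \beta[x]$

3. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{ninf} \alpha[x] \Rightarrow_{hp} \beta[x]$

4. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{om} \alpha[x] \Rightarrow_{hp} \beta[x]$

5. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{pos} \alpha[x] \Rightarrow_{hp} \beta[x]$

6. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{maj} \alpha[x] \Rightarrow_{hp} \beta[x]$

7. $KB_{\Rightarrow_{hp}} \vDash^{hp}_{inf_{II}} \alpha[x] \Rightarrow_{hp} \beta[x]$.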
Proof:

The proofs of the equivalence of the claims 1, 3, 5 can be found in or
extracted from Adams [1], [2], chapter II of [3], or, in a more informal
presentation, from the chapters 6 and 7 of Adams[6]. The equivalence of claim 1
and 2 was hinted at in a footnote on p.277 of Adams [4] and may be proved
in the same way as the equivalence of the $\epsilon$-$\delta$-definition of continuity of a real
function and the corresponding sequence-definition of continuity is proved. The
equivalence of 6 to the other claims may be seen as follows:

• $\vDash^{hp}_{ninf} \subseteq \vDash^{hp}_{maj}$:

assume that $KB_{\Rightarrow_{hp}} = \{\varphi_1[x] \Rightarrow_{hp} \psi_1[x], \ldots, \varphi_n[x] \Rightarrow_{hp} \psi_n[x]\}$,
and $KB_{\Rightarrow_{hp}} \vDash^{hp}_{ninf} \alpha[x] \Rightarrow_{hp} \beta[x]$, i.e., for every probability measure
Prob:

$Prob(\beta[x] \mid \alpha[x]) \geq 1 - \sum_{\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}} Unc(\psi[x] \mid \varphi[x])$.

Now let Prob be a probability measure, s.t.

$Prob(\psi_1[x] \mid \varphi_1[x]) > 1 - \frac{1}{2n}, \ldots, Prob(\psi_n[x] \mid \varphi_n[x]) > 1 - \frac{1}{2n}$,

so we have that

$Prob(\beta[x] \mid \alpha[x]) \geq 1 - \sum_{\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}} Unc(\psi[x] \mid \varphi[x]) = 1 - \sum_{\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}} \left(1 - Prob(\psi[x] \mid \varphi[x])\right) > 1 - n \cdot \frac{1}{2n} = \frac{1}{2}$; done.

• $\vDash^{hp}_{maj} \subseteq \vDash^{hp}_{pos}$:

assume that $KB_{\Rightarrow_{hp}} = \{\varphi_1[x] \Rightarrow_{hp} \psi_1[x], \ldots, \varphi_n[x] \Rightarrow_{hp} \psi_n[x]\}$,
and $KB_{\Rightarrow_{hp}} \vDash^{hp}_{maj} \alpha[x] \Rightarrow_{hp} \beta[x]$, i.e., for all probability measures Prob:
if $Prob(\psi_1[x] \mid \varphi_1[x]) > 1 - \frac{1}{2n}, \ldots, Prob(\psi_n[x] \mid \varphi_n[x]) > 1 - \frac{1}{2n}$,
then $Prob(\beta[x] \mid \alpha[x]) > \frac{1}{2}$.

Thus, there is an $\epsilon > 0$ (use any $\epsilon < \frac{1}{2n}$), s.t. for all probability measures
Prob:

if for all $\varphi[x] \Rightarrow_{hp} \psi[x] \in KB_{\Rightarrow_{hp}}$: $Prob(\psi[x] \mid \varphi[x]) \geq 1 - \epsilon$,
then $Prob(\beta[x] \mid \alpha[x]) > 0$; and we are finished again.

The equivalence of 1 and 7 is, essentially, shown in
Lehmann&Magidor[88], using some of the quoted results of Adams. ■
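
The arithmetic behind the first inclusion can be sketched numerically. The snippet below assumes, as our reading of the damaged formulas, that the majority threshold for n premises is $1 - \frac{1}{2n}$; the function names and the sample premise probabilities are hypothetical. It shows that premise probabilities above the threshold push the uncertainty-sum bound above one half.

```python
from fractions import Fraction

def majority_threshold(n):
    """Assumed premise threshold 1 - 1/(2n) for a knowledge base with n premises."""
    return 1 - Fraction(1, 2 * n)

def uncertainty_sum_bound(premise_probs):
    """Lower bound 1 - (sum of premise uncertainties) on the conclusion's probability."""
    return 1 - sum((1 - p for p in premise_probs), Fraction(0))

# Two hypothetical premise probabilities, both above the threshold 3/4 for n = 2.
premises = [Fraction(4, 5), Fraction(9, 10)]
n = len(premises)
all_above = all(p > majority_threshold(n) for p in premises)
bound = uncertainty_sum_bound(premises)
```

Here the bound comes out as 7/10, which exceeds 1/2 as the proof requires; with premise probabilities exactly at the threshold the bound would be exactly 1/2, which is why the inequalities in the definition are strict.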

Theorem 70 (Equivalence of the Different Versions of Probability Semantics
with respect to Validity)

Let $\alpha[x] \Rightarrow_{hp} \beta[x] \in \mathcal{L}_{\Rightarrow_{hp}}$;

the following claims are equivalent:

1. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is infinitesimally$_I$ valid

2. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is sequence-valid

3. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is noninfinitesimally valid

4. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is order-of-magnitude-valid

5. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is positively valid

6. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is majority-valid

7. $\alpha[x] \Rightarrow_{hp} \beta[x]$ is infinitesimally$_{II}$ valid

8. $Prob(\beta[x] \mid \alpha[x]) = 1$ for every probability measure Prob

9. $W \vDash^u \alpha[x] \rightarrow \beta[x]$.

Proof:

The equivalence of 1-7 follows from theorem 69 together with the
observation that where a valid formula has been defined as being satisfied by
all models of a certain kind¶, it might equivalently have been defined as
being logically entailed by the empty set. The equivalence of 1 and 8 is directly
entailed by the definition of validity in the infinitesimal$_I$ sense. The
equivalence of 9 and 8 may be seen in the following way: if $W \vDash^u \alpha[x] \rightarrow \beta[x]$,
then $Prob(\beta[x] \mid \alpha[x]) = 1$ for every probability measure Prob since every such
measure is - by presupposition - defined on $\wp(W)$, and $Prob(\beta[x] \mid \alpha[x]) = 1$
also in the case that $Prob([\alpha]) = 0$. For the converse, note that every Prob
that assigns 1 to a single world in W and 0 to all other worlds is a probability
measure, and therefore if $Prob(\beta[x] \mid \alpha[x]) = 1$ for every probability measure
Prob, then $W \vDash^u \alpha[x] \rightarrow \beta[x]$. ■
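
The converse direction of the last step uses only point measures, which makes it easy to illustrate. In the sketch below (representation and names are our own assumptions) a point measure puts weight 1 on a single world, and we compare “conditional probability 1 under every point measure” with universal satisfaction in W. The sketch covers point measures only; the direction from 9 to 8 for arbitrary measures is the argument given in the proof.

```python
from fractions import Fraction

def point_measure(world):
    """The probability measure assigning 1 to one world and 0 to all others."""
    return {world: Fraction(1)}

def cond_prob(measure, beta, alpha):
    """Prob(beta | alpha), defined to be 1 if Prob(alpha) = 0."""
    p_alpha = sum((p for w, p in measure.items() if alpha(w)), Fraction(0))
    if p_alpha == 0:
        return Fraction(1)
    p_both = sum((p for w, p in measure.items() if alpha(w) and beta(w)), Fraction(0))
    return p_both / p_alpha

def certain_under_point_measures(W, alpha, beta):
    return all(cond_prob(point_measure(w), beta, alpha) == 1 for w in W)

def universally_satisfied(W, alpha, beta):
    """W |=u alpha[x] -> beta[x]: every alpha-world in W is a beta-world."""
    return all(beta(w) for w in W if alpha(w))

# A hypothetical set of worlds (sets of true variables).
W = [frozenset({"penguin", "bird"}), frozenset({"bird", "flies"}), frozenset()]
bird = lambda w: "bird" in w
penguin = lambda w: "penguin" in w
flies = lambda w: "flies" in w

checks = [(certain_under_point_measures(W, a, b), universally_satisfied(W, a, b))
          for a, b in [(penguin, bird), (bird, flies)]]
```

For both sample conditionals the two tests agree: the first is satisfied by every world of W, the second fails at the world containing a non-flying bird, where the corresponding point measure yields conditional probability 0.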

Theorem 71 (Equivalence of the Different Versions of Probability Semantics
with respect to Theories)

Let $\mathcal{TH}_{\Rightarrow_{hp}} \subseteq \mathcal{L}_{\Rightarrow_{hp}}$;

the following claims are equivalent:

1. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the infinitesimal$_I$ sense

2. $\mathcal{TH}_{\Rightarrow_{hp}}$ is the high probability conditional theory corresponding to some
probabilistic sequence model $\mathfrak{M}^{hp}_{seq}$

3. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the noninfinitesimal
sense

4. $\mathcal{TH}_{\Rightarrow_{hp}}$ is the high probability conditional theory corresponding to some
set X of probabilistic order-of-magnitude models

5. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the positive sense

6. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the majority sense

7. $\mathcal{TH}_{\Rightarrow_{hp}}$ is a high probability conditional theory in the infinitesimal$_{II}$
sense.

¶ But recall that each of these models is based on the same given set W of worlds, i.e.,
validity is relativized to W.

Proof: 

The equivalences are consequences of theorem 69 and the results stated 
in Kraus et al.[85] and Lehmann&Magidor[88] . ■ 

Obviously (by theorem 70), we also have:

Remark 72 $\{\alpha[x] \Rightarrow_{hp} \beta[x] \mid \alpha[x] \rightarrow \beta[x] \in \mathcal{TH}_{\rightarrow}(W)\}$ is a subset of
every kind of high probability conditional theory.



9.3 Semantics for Normic Conditionals 

Let us now turn to the semantics of the formal counterparts of expressions of the
form ‘normal $\alpha$s are $\beta$s’. In part II we have circumscribed such phrases by ‘the
most normal objects among those objects that satisfy $\alpha$ are also objects
satisfying $\beta$’, presupposing a normality order of objects, or - in the present context
- of worlds. This is precisely the idea underlying the ranked model
semantics for normic conditionals below. However, following Kraus et al.[85] and
Lehmann&Magidor[88], who have introduced each of the subsequent systems
of normality semantics, we suggest allowing for a certain generalization of the
original idea: the normality order will not hold between the worlds themselves,
but rather between so-called states‖ that are labelled with sets of worlds.

If every such label is a singleton (as is the case in the preferential
model semantics below), the conception of a world-based normality order is
restored except for the fact that worlds may occur within this order in multiple
“places”, since they may be the labels of different states. In this way it is e.g.
possible to define a normality order by disjoint union (see e.g. Moschovakis[110],
p.36) of two or more given normality orders of worlds, where by each of the
latter orders the worlds are ordered according to their normality in a particular
respect; the resulting “common” order of the given orders is no longer defined
for worlds but for states labelled by (singletons of) worlds.

But such a generalized normal states semantics also allows for further
interpretations, in particular, if the labels of states are not just singletons but
also include sets of worlds with more than one member: e.g., if $s_1, s_2$
are states that are ordered according to their normality, s.t. $s_1$ is more normal
than $s_2$ (formally: $s_1 \prec s_2$), and where $s_1$ is labelled with a set X of worlds
(formally: $l(s_1) = X$), and $s_2$ is labelled with a set Y of worlds (formally:
$l(s_2) = Y$), this can be interpreted in the way that the set X is more normal
in some respect than the set Y, or, if expressed in an intensional idiom: one
property is more normal in some respect than another. In this case, satisfaction
of normic conditionals is defined such that $\alpha[x] \Rightarrow_{nor} \beta[x]$ is true if and only if
the most normal properties among those that are contained in/entailed by the
property of being $\alpha$, are also contained in/entailed by the property of being $\beta$.

‖ This is the terminology of Kraus et al.[85]; ‘states’ in this sense do of course not
(necessarily) have anything to do with the mental states of an agent.

A more formal reason why such a generalized normal states semantics 
is to be introduced is that otherwise the completeness parts of the theorems 
104-106 would not hold. 

The following semantical systems are instances of a normality seman- 
tics for states, with an increasing set of constraints on orders and labellings; 
W is again the given set of worlds: 

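Definition 73 (Cumulative Model Semantics for Normic Conditionals)

1. A cumulative model $\mathfrak{M}^n$ is a triple $(S, l, \prec)$ with

(a) a non-empty set S of states

(b) a labelling $l: S \rightarrow \wp(W) \setminus \{\emptyset\}$ of states

(c) a normality “order”, or preference relation, $\prec \subseteq S \times S$ between states
(if $s_1 \prec s_2$, we say that $s_1$ is more normal than $s_2$);

but $\prec$ is not demanded to satisfy any formal constraints, in particular,
it is not necessarily a strict order relation

(d) s.t., $\mathfrak{M}^n$ satisfies the Smoothness Condition (see below)

2. factual formulas $\alpha \in \mathcal{L}$ are made true by states $s \in S$ in the following
way:

$s \mid\equiv \alpha$ iff $\forall w \in l(s)$: $w \vDash \alpha$

(in such a case we also say that s is an $\alpha$-state)

3. for every $\alpha \in \mathcal{L}$ let $\hat{\alpha} = \{s \in S \mid s \mid\equiv \alpha\}$

4. for every $\alpha \in \mathcal{L}$: $s \in \hat{\alpha}$ is minimal in $\hat{\alpha}$ iff $\neg\exists s' \in \hat{\alpha}$: $s' \prec s$

5. the Smoothness Condition says that every state that makes $\alpha$ true is either
itself most normal among the states which make $\alpha$ true, or there is a more
normal state that makes $\alpha$ true and which is also most normal among the
states that make $\alpha$ true; i.e.:

$\forall \alpha \in \mathcal{L}, \forall s \in \hat{\alpha}$: s is minimal in $\hat{\alpha}$ or $\exists s' \prec s$, s.t. $s'$ is minimal in $\hat{\alpha}$

6. relative to a cumulative model $\mathfrak{M}^n = (S, l, \prec)$ we can define:

$\mathfrak{M}^n \vDash^n \alpha[x] \Rightarrow_{nor} \beta[x]$
iff $\forall s \in S$: if s is minimal in $\hat{\alpha}$, then $s \mid\equiv \beta$

(i.e.: the most normal states among those that make $\alpha$ true also make $\beta$
true, or: normal $\alpha$s are $\beta$s)

7. let $\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}^n) = \{\alpha[x] \Rightarrow_{nor} \beta[x] \mid \mathfrak{M}^n \vDash^n \alpha[x] \Rightarrow_{nor} \beta[x]\}$;

$\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}^n)$ is the normic conditional theory corresponding to $\mathfrak{M}^n$

8. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is cumulatively valid iff

for every cumulative model $\mathfrak{M}^n$: $\mathfrak{M}^n \vDash^n \alpha[x] \Rightarrow_{nor} \beta[x]$

9. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathfrak{M}^n$ be a cumulative model:

$\mathfrak{M}^n \vDash^n KB_{\Rightarrow_{nor}}$ iff

for every $\alpha[x] \Rightarrow_{nor} \beta[x] \in KB_{\Rightarrow_{nor}}$ it holds that $\mathfrak{M}^n \vDash^n \alpha[x] \Rightarrow_{nor} \beta[x]$

10. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$:

we say that

$KB_{\Rightarrow_{nor}} \vDash^n \alpha[x] \Rightarrow_{nor} \beta[x]$

($KB_{\Rightarrow_{nor}}$ cumulatively entails $\alpha[x] \Rightarrow_{nor} \beta[x]$) iff

for every cumulative model $\mathfrak{M}^n$: if $\mathfrak{M}^n \vDash^n KB_{\Rightarrow_{nor}}$ then $\mathfrak{M}^n \vDash^n \alpha[x] \Rightarrow_{nor} \beta[x]$.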
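
The clauses of definition 73 translate almost literally into code when S and W are finite. The following sketch is our own illustration: states are strings, labels are sets of worlds (sets of true variables), the relation $\prec$ is a set of pairs, and all names and the example model are hypothetical. The function smooth_for checks the Smoothness Condition for one given formula only, whereas the definition quantifies over all formulas of the language.

```python
def makes_true(label, s, formula):
    """s |= alpha: every world in the label of s satisfies alpha."""
    return all(formula(w) for w in label[s])

def hat(states, label, alpha):
    """The set of alpha-states."""
    return [s for s in states if makes_true(label, s, alpha)]

def minimal(states, label, prec, alpha):
    """alpha-states for which no alpha-state is more normal."""
    a_hat = hat(states, label, alpha)
    return [s for s in a_hat if not any((t, s) in prec for t in a_hat)]

def smooth_for(states, label, prec, alpha):
    """Smoothness for one formula: each alpha-state is minimal or lies above a minimal one."""
    mins = minimal(states, label, prec, alpha)
    return all(s in mins or any((t, s) in prec for t in mins)
               for s in hat(states, label, alpha))

def satisfies_normic(states, label, prec, alpha, beta):
    """The model satisfies alpha => beta iff all minimal alpha-states make beta true."""
    return all(makes_true(label, s, beta) for s in minimal(states, label, prec, alpha))

states = ["s0", "s1", "s2"]
label = {
    "s0": {frozenset({"bird", "flies"}), frozenset({"bird", "flies", "sings"})},
    "s1": {frozenset({"bird", "penguin"})},
    "s2": {frozenset({"bird", "penguin", "flies"})},
}
prec = {("s0", "s1"), ("s1", "s2"), ("s0", "s2")}   # (s, t) means s is more normal than t
cyclic = {("s1", "s2"), ("s2", "s1")}               # a relation that is not smooth

bird = lambda w: "bird" in w
penguin = lambda w: "penguin" in w
flies = lambda w: "flies" in w

birds_fly = satisfies_normic(states, label, prec, bird, flies)
penguins_fly = satisfies_normic(states, label, prec, penguin, flies)
smooth_ok = smooth_for(states, label, prec, penguin)
smooth_cyclic = smooth_for(states, label, cyclic, penguin)
```

In the example, normal birds fly while normal penguins do not, so strengthening the antecedent can defeat a normic conditional. The cyclic relation shows why clause 1(d) is needed: under it no penguin-state is minimal, so the Smoothness Condition fails and the triple is not a cumulative model.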

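Definition 74 (Cumulative-Ordered Model Semantics for Normic Conditionals)

1. A cumulative-ordered model $\mathfrak{M}^n_{co}$ is a cumulative model $(S, l, \prec)$, s.t.

$\prec$ is a strict partial order, i.e., irreflexive and transitive

2. relative to a cumulative-ordered model $\mathfrak{M}^n_{co} = (S, l, \prec)$ we can define:

$\mathfrak{M}^n_{co} \vDash^n_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$
iff $\forall s \in S$: if s is minimal in $\hat{\alpha}$, then $s \mid\equiv \beta$

3. let $\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}^n_{co}) = \{\alpha[x] \Rightarrow_{nor} \beta[x] \mid \mathfrak{M}^n_{co} \vDash^n_{co} \alpha[x] \Rightarrow_{nor} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}^n_{co})$ is the normic conditional theory corresponding to $\mathfrak{M}^n_{co}$

4. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is cumulative-ordered-valid iff

for every cumulative-ordered model $\mathfrak{M}^n_{co}$: $\mathfrak{M}^n_{co} \vDash^n_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$

5. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathfrak{M}^n_{co}$ be a cumulative-ordered model:

$\mathfrak{M}^n_{co} \vDash^n_{co} KB_{\Rightarrow_{nor}}$ iff

for every $\alpha[x] \Rightarrow_{nor} \beta[x] \in KB_{\Rightarrow_{nor}}$ it holds that $\mathfrak{M}^n_{co} \vDash^n_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$

6. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$:

we say that

$KB_{\Rightarrow_{nor}} \vDash^n_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$

($KB_{\Rightarrow_{nor}}$ cumulative-ordered-entails $\alpha[x] \Rightarrow_{nor} \beta[x]$) iff

for every cumulative-ordered model $\mathfrak{M}^n_{co}$:

if $\mathfrak{M}^n_{co} \vDash^n_{co} KB_{\Rightarrow_{nor}}$ then $\mathfrak{M}^n_{co} \vDash^n_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$.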
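Definition 75 (Preferential Model Semantics for Normic Conditionals)

1. A preferential model is a cumulative-ordered model $\langle S, l, \prec \rangle$, s.t.

$\forall s \in S$: $l(s)$ is a singleton, i.e., $l(s) = \{w\}$ for some $w \in W$

2. relative to a preferential model $\mathfrak{M}_p = \langle S, l, \prec \rangle$ we can define:

$\mathfrak{M}_p \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$
iff $\forall s \in S$: if $s$ is minimal in $\hat{\alpha}$, then $s \mathrel{|\!\equiv} \beta$

3. let $\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_p) = \{\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}} \mid \mathfrak{M}_p \models_p \alpha[x] \Rightarrow_{nor} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_p)$ is the normic conditional theory corresponding to $\mathfrak{M}_p$

4. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is preferentially valid iff

for every preferential model $\mathfrak{M}_p$: $\mathfrak{M}_p \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$

5. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathfrak{M}_p$ be a preferential model:

$\mathfrak{M}_p \models_p KB_{\Rightarrow_{nor}}$ iff

for every $\alpha[x] \Rightarrow_{nor} \beta[x] \in KB_{\Rightarrow_{nor}}$ it holds that $\mathfrak{M}_p \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$

6. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$:

we say that

$KB_{\Rightarrow_{nor}} \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$

($KB_{\Rightarrow_{nor}}$ preferentially entails $\alpha[x] \Rightarrow_{nor} \beta[x]$) iff
for every preferential model $\mathfrak{M}_p$: if $\mathfrak{M}_p \models_p KB_{\Rightarrow_{nor}}$, then $\mathfrak{M}_p \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$.
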

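Definition 76 (Ranked Model Semantics for Normic Conditionals)

1. A ranked model $\mathfrak{M}_r$ is a preferential model $\langle S, l, \prec \rangle$, where

for some $k \in \mathbb{N}$ there is a surjective mapping $rk: S \to \{0, \ldots, k\}$, s.t.,
for all $s_1, s_2 \in S$: $s_1 \prec s_2$ iff $rk(s_1) < rk(s_2)$

($rk(s)$ is called the ‘rank’ of $s$ under $rk$)

2. relative to a ranked model $\mathfrak{M}_r = \langle S, l, \prec \rangle$ we can define:

$\mathfrak{M}_r \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$
iff $\forall s \in S$: if $s$ is minimal in $\hat{\alpha}$, then $s \mathrel{|\!\equiv} \beta$

3. let $\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_r) = \{\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}} \mid \mathfrak{M}_r \models_r \alpha[x] \Rightarrow_{nor} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_r)$ is the normic conditional theory corresponding to $\mathfrak{M}_r$

4. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is rank-valid iff

for every ranked model $\mathfrak{M}_r$: $\mathfrak{M}_r \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$

5. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathfrak{M}_r$ be a ranked model:

$\mathfrak{M}_r \models_r KB_{\Rightarrow_{nor}}$ iff

for every $\alpha[x] \Rightarrow_{nor} \beta[x] \in KB_{\Rightarrow_{nor}}$ it holds that $\mathfrak{M}_r \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$

6. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$:

we say that

$KB_{\Rightarrow_{nor}} \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$
($KB_{\Rightarrow_{nor}}$ rank-entails $\alpha[x] \Rightarrow_{nor} \beta[x]$) iff

for every ranked model $\mathfrak{M}_r$: if $\mathfrak{M}_r \models_r KB_{\Rightarrow_{nor}}$, then $\mathfrak{M}_r \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$.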
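The ranking clause can be illustrated computationally: if the order is induced by a rank function, then the minimal states of any set of states are exactly the states of lowest rank in that set. The sketch below assumes a hypothetical four-state rank assignment and checks this agreement over all subsets; the function names are illustrative and do not come from the text.

```python
from itertools import combinations

def order_from_rank(rk):
    """Induce the order: s is more normal than t iff rk(s) < rk(t)."""
    return {(s, t) for s in rk for t in rk if rk[s] < rk[t]}

def is_surjective_ranking(rk):
    """Check that rk maps onto {0, ..., k} for k the highest rank used."""
    k = max(rk.values())
    return set(rk.values()) == set(range(k + 1))

def order_minimal(subset, prec):
    """Minimal states in the sense of the order."""
    return {s for s in subset if not any((t, s) in prec for t in subset)}

def rank_minimal(subset, rk):
    """States of lowest rank within the subset."""
    if not subset:
        return set()
    lowest = min(rk[s] for s in subset)
    return {s for s in subset if rk[s] == lowest}

# Hypothetical rank assignment for four states.
rk = {"s0": 0, "s1": 0, "s2": 1, "s3": 2}
prec = order_from_rank(rk)

subsets = [set(c) for r in range(len(rk) + 1)
           for c in combinations(sorted(rk), r)]
agree = all(order_minimal(x, prec) == rank_minimal(x, rk) for x in subsets)
```

Evaluating a normic conditional in a ranked model therefore only requires inspecting the lowest-ranked states that make the antecedent true.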

The definition of ranked models in Lehmann&Magidor[88] is actually
more complex than ours, but our definition is equivalent for the case of a finite
set W of worlds. Note that in the case of ranked models, one might actually
presuppose that no world is the label of more than just one state, since the
completeness part of theorem 107 would hold nevertheless. The same is true
of the subsequent simple cumulative and simple preferential model semantics,
but we follow the original definitions in [88]:

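Definition 77 (Simple Cumulative Model Semantics for Normic Conditionals)

1. A simple cumulative model is a cumulative-ordered model $\langle S, l, \prec \rangle$,
s.t.

$\prec$ is the empty relation, i.e., $\prec = \emptyset$

2. relative to a simple cumulative model $\mathfrak{M}_{sc} = \langle S, l, \prec \rangle$ we can define:

$\mathfrak{M}_{sc} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$

iff $\forall s \in S$: if $s$ is minimal in $\hat{\alpha}$, then $s \mathrel{|\!\equiv} \beta$,
i.e., in the simple cumulative case:

$\forall s \in S$: if $s \mathrel{|\!\equiv} \alpha$, then $s \mathrel{|\!\equiv} \beta$

3. let $\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{sc}) = \{\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}} \mid \mathfrak{M}_{sc} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{sc})$ is the normic conditional theory corresponding to $\mathfrak{M}_{sc}$

4. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is simple-cumulatively valid iff

for every simple cumulative model $\mathfrak{M}_{sc}$: $\mathfrak{M}_{sc} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$

5. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathfrak{M}_{sc}$ be a simple cumulative model:

$\mathfrak{M}_{sc} \models_{sc} KB_{\Rightarrow_{nor}}$ iff

for every $\alpha[x] \Rightarrow_{nor} \beta[x] \in KB_{\Rightarrow_{nor}}$ it holds that $\mathfrak{M}_{sc} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$

6. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$:

we say that

$KB_{\Rightarrow_{nor}} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$
($KB_{\Rightarrow_{nor}}$ simple-cumulatively entails $\alpha[x] \Rightarrow_{nor} \beta[x]$) iff

for every simple cumulative model $\mathfrak{M}_{sc}$: if $\mathfrak{M}_{sc} \models_{sc} KB_{\Rightarrow_{nor}}$, then $\mathfrak{M}_{sc} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$.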

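Definition 78 (Simple Preferential Model Semantics for Normic Conditionals)

1. A simple preferential model is a preferential model $\langle S, l, \prec \rangle$, s.t.

$\prec$ is the empty relation, i.e., $\prec = \emptyset$

2. relative to a simple preferential model $\mathfrak{M}_{sp} = \langle S, l, \prec \rangle$ we can define:

$\mathfrak{M}_{sp} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$

iff $\forall s \in S$: if $s$ is minimal in $\hat{\alpha}$, then $s \mathrel{|\!\equiv} \beta$,
i.e., in the simple preferential case:

$\forall s \in S$: if $s \mathrel{|\!\equiv} \alpha$, then $s \mathrel{|\!\equiv} \beta$

3. let $\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{sp}) = \{\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}} \mid \mathfrak{M}_{sp} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]\}$:

$\mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{sp})$ is the normic conditional theory corresponding to $\mathfrak{M}_{sp}$

4. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is simple-preferentially valid iff

for every simple preferential model $\mathfrak{M}_{sp}$: $\mathfrak{M}_{sp} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$

5. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathfrak{M}_{sp}$ be a simple preferential model:

$\mathfrak{M}_{sp} \models_{sp} KB_{\Rightarrow_{nor}}$ iff

for every $\alpha[x] \Rightarrow_{nor} \beta[x] \in KB_{\Rightarrow_{nor}}$ it holds that $\mathfrak{M}_{sp} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$

6. let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$:

we say that

$KB_{\Rightarrow_{nor}} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$

($KB_{\Rightarrow_{nor}}$ simple-preferentially entails $\alpha[x] \Rightarrow_{nor} \beta[x]$) iff
for every simple preferential model $\mathfrak{M}_{sp}$:
if $\mathfrak{M}_{sp} \models_{sp} KB_{\Rightarrow_{nor}}$, then $\mathfrak{M}_{sp} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$.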
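The different systems of normality semantics are related with respect
to their logical strength as follows:

Theorem 79 (Containment Relations for the Different Versions of Normality
Semantics with respect to Entailment)

Let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$;

it holds:

1. if $KB_{\Rightarrow_{nor}} \models_c \alpha[x] \Rightarrow_{nor} \beta[x]$ then $KB_{\Rightarrow_{nor}} \models_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$

2. if $KB_{\Rightarrow_{nor}} \models_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$ then $KB_{\Rightarrow_{nor}} \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$

3. $KB_{\Rightarrow_{nor}} \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$ iff $KB_{\Rightarrow_{nor}} \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$

4. if $KB_{\Rightarrow_{nor}} \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$ then $KB_{\Rightarrow_{nor}} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$

5. if $KB_{\Rightarrow_{nor}} \models_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$ then $KB_{\Rightarrow_{nor}} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$

6. if $KB_{\Rightarrow_{nor}} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$ then $KB_{\Rightarrow_{nor}} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$.

Proof:

1 and 2 are obvious by definition. 3 is proved by Lehmann&Magidor[88],
and entails 4. 5 and 6 are again direct consequences of our definitions from
above. As Kraus et al.[85] and Lehmann&Magidor[88] show, each of the
implications stated in the theorem is irreversible for nontrivial $\mathcal{L}$ and W. ■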

Theorem 80 (Equivalence of the Different Versions of Normality Semantics
with respect to Validity)

Let $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$;

the following claims are equivalent:

1. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is cumulatively valid

2. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is cumulative-ordered-valid

3. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is preferentially valid

4. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is rank-valid

5. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is simple-cumulatively valid

6. $\alpha[x] \Rightarrow_{nor} \beta[x]$ is simple-preferentially valid

7. $W \models \alpha[x] \to \beta[x]$.

Proof:

The equivalences follow from the fact that a single state that is labelled
with a single arbitrary world in W is a model according to each of the normality
notions of a model that we have presented. Thus, if $\alpha[x] \Rightarrow_{nor} \beta[x]$ is
...(-)valid, $\alpha[a] \to \beta[a]$ is true in every world of W. On the other hand, if
$\alpha[a] \to \beta[a]$ is true in every world of W, then by each semantics also
$\alpha[x] \Rightarrow_{nor} \beta[x]$ is ...(-)valid. ■



Theorem 81 (Containment Relations for the Different Versions of Normality
Semantics with respect to Theories)

Let $\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$;

it holds:

1. if $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a simple
preferential model $\mathfrak{M}_{sp}$,
then $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a ranked
model $\mathfrak{M}_r$

2. if $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a ranked model
$\mathfrak{M}_r$,
then $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a
preferential model $\mathfrak{M}_p$

3. if $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a preferential
model $\mathfrak{M}_p$,
then $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a
cumulative-ordered model $\mathfrak{M}_{co}$

4. if $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a
cumulative-ordered model $\mathfrak{M}_{co}$,
then $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a
cumulative model $\mathfrak{M}_c$

5. if $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a simple
preferential model $\mathfrak{M}_{sp}$,
then $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a simple
cumulative model $\mathfrak{M}_{sc}$

6. if $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a simple
cumulative model $\mathfrak{M}_{sc}$,
then $\mathcal{TH}_{\Rightarrow_{nor}}$ is a normic conditional theory corresponding to a
cumulative-ordered model $\mathfrak{M}_{co}$.

Proof:

The implications are immediate consequences of the model definitions
above. ■

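Obviously (the first part follows directly from theorem 80):

Remark 82

1. $\{\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}} \mid \alpha[x] \to \beta[x] \in \mathcal{TH}_{\to}(W)\}$ is a subset of every
kind of normic conditional theory

2. simple preferential models $\mathfrak{M}_{sp} = \langle S, l, \prec \rangle$ satisfy the same conditionals
as the universal models $\mathfrak{M}_u = \{w \in W \mid \exists s \in S: l(s) = \{w\}\}$, apart
from replacing $\Rightarrow_{nor}$ by $\to$.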




Chapter 10 

SYSTEMS OF DEDUCTIVE AND NONMONOTONIC LOGIC 



In this chapter we will summarize various systems which have been introduced
by Kraus et al.[85]. Contrary to Kraus et al. we develop these systems in a
strictly syntactical manner without making use of nonmonotonic consequence
relations (see Makinson[99] and [100]). As a substitute for consequence
relations we refer again to conditional theories, which are defined as sets of
defeasible conditionals closed under the rules of the system considered, while
extending a given set $\mathcal{TH}_{\to}$ of universal conditionals; $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ is a
deductively (i.e., classically) closed set of universal formulas. Most notions which
are defined in this section will only be given relative to such a theory $\mathcal{TH}_{\to}$.
$\mathcal{TH}_{\to}$ may be thought of as being identical to the universal theory $\mathcal{TH}_{\to}(W)$
corresponding to the universal model $\mathfrak{M}_u = W$ for the given set W of worlds.
The relativization to $\mathcal{TH}_{\to}$ has been introduced by Kraus et al.[85], but it is
not used by Adams and others. Instead of stating that $\alpha \mathrel{|\!\sim} \beta$, we rather
state that $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$, where $\mathcal{TH}_{\Rightarrow}$ is a theory of conditionals.

Some of the subsequent rules apply if certain conditions on universal 
conditionals and defeasible conditionals are satisfied; compare our account of 
nonmonotonic reasoning from strict and defeasible premises in part II. 

Let $\Rightarrow$ be an arbitrary implication sign that is not necessarily intended
to be nonmonotonic, although this is going to be the most relevant case:

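Definition 83 (Conditional Theories)

1. A conditional C-theory extending $\mathcal{TH}_{\to}$ is a set $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ with the
property that for all $\alpha[a] \in \mathcal{L}$ it holds that $\alpha[x] \Rightarrow \alpha[x] \in \mathcal{TH}_{\Rightarrow}$ (Reflexivity),

and which is closed under the following rules:

(a) $\dfrac{\mathcal{TH}_{\to} \vdash \alpha[x] \to \beta[x],\ \mathcal{TH}_{\to} \vdash \beta[x] \to \alpha[x],\ \alpha[x] \Rightarrow \gamma[x]}{\beta[x] \Rightarrow \gamma[x]}$ (Left Equivalence)

(b) $\dfrac{\mathcal{TH}_{\to} \vdash \alpha[x] \to \beta[x],\ \gamma[x] \Rightarrow \alpha[x]}{\gamma[x] \Rightarrow \beta[x]}$ (Right Weakening)

(c) $\dfrac{\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x],\ \alpha[x] \Rightarrow \beta[x]}{\alpha[x] \Rightarrow \gamma[x]}$ (Cut)

(d) $\dfrac{\alpha[x] \Rightarrow \beta[x],\ \alpha[x] \Rightarrow \gamma[x]}{\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]}$ (Cautious Monotonicity)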

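We refer to the axiom scheme and the rules above as the system C (see
[85], pp. 176-180). The rules are to be read in the following way:

e.g., by Cut, if $\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x] \in \mathcal{TH}_{\Rightarrow}$ and $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$, then
$\alpha[x] \Rightarrow \gamma[x] \in \mathcal{TH}_{\Rightarrow}$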

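2. a conditional C-theory $\mathcal{TH}_{\Rightarrow}$ is consistent iff $\top \Rightarrow \bot \notin \mathcal{TH}_{\Rightarrow}$

3. a conditional CL-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is a conditional C-theory
extending $\mathcal{TH}_{\to}$, which is closed under the following rule:

$\dfrac{\alpha_0[x] \Rightarrow \alpha_1[x],\ \alpha_1[x] \Rightarrow \alpha_2[x],\ \ldots,\ \alpha_{j-1}[x] \Rightarrow \alpha_j[x],\ \alpha_j[x] \Rightarrow \alpha_0[x]}{\alpha_r[x] \Rightarrow \alpha_{r'}[x]}$ (Loop)

($r$, $r'$ are arbitrary members of $\{0, \ldots, j\}$)

We refer to C+Loop as the system CL (see [85], p. 187)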

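4. a conditional P-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is a conditional CL-theory
extending $\mathcal{TH}_{\to}$, which is closed under the following rule:

$\dfrac{\alpha[x] \Rightarrow \gamma[x],\ \beta[x] \Rightarrow \gamma[x]}{\alpha[x] \vee \beta[x] \Rightarrow \gamma[x]}$ (Or)

We refer to CL+Or as the system P (see [85], pp. 189-190; there it is also
shown that Loop is actually redundant since derivable in P)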

5. a conditional R-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is a conditional P-theory
extending $\mathcal{TH}_{\to}$, which has the following (non-Horn*) property:

if $\alpha[x] \Rightarrow \gamma[x] \in \mathcal{TH}_{\Rightarrow}$, and $\alpha[x] \Rightarrow \neg\beta[x] \notin \mathcal{TH}_{\Rightarrow}$, then
$\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x] \in \mathcal{TH}_{\Rightarrow}$ (Rational Monotonicity).

We refer to P+Rational Monotonicity as the system R (see [88], pp. 16-48).
Rational Monotonicity might be called a ‘nonmonotonic second-order
inference rule’ (compare Schurz[151])

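6. a conditional CM-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is a conditional C-theory
extending $\mathcal{TH}_{\to}$, which is closed under the following rule:

$\dfrac{\mathcal{TH}_{\to} \vdash \alpha[x] \to \beta[x],\ \beta[x] \Rightarrow \gamma[x]}{\alpha[x] \Rightarrow \gamma[x]}$ (Monotonicity)

We refer to C+Monotonicity as the system CM (see [85], pp. 200-201)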



*For more on non-Horn conditions see Makinson[99], section 4.1. 
7. a conditional M-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is a conditional C-theory
extending $\mathcal{TH}_{\to}$, which is closed under the following rule:

$\dfrac{\alpha[x] \Rightarrow \beta[x]}{\neg\beta[x] \Rightarrow \neg\alpha[x]}$ (Contraposition)

We refer to C+Contraposition as the system M (see [85], p. 202). Actually,
M is nothing but a system of classical logic

(for all $\alpha[a], \beta[a], \gamma[a], \alpha_0[a], \alpha_1[a], \ldots, \alpha_j[a] \in \mathcal{L}$).
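Rational Monotonicity separates R from P on the semantic side: it holds for the theories of ranked models, but it can fail in a preferential model whose order is not induced by a ranking. The sketch below builds a hypothetical three-state preferential model (the state names, variables, and order are assumptions chosen for illustration) in which a normally implies c, a does not normally imply the negation of b, and yet a together with b does not normally imply c. It also checks modularity, which for finite strict partial orders is the standard condition for being induced by a rank function.

```python
def minimal(subset, prec):
    """States of the subset with no more normal state inside the subset."""
    return {s for s in subset if not any((t, s) in prec for t in subset)}

def holds(world_of, prec, alpha, beta):
    """alpha => beta: every minimal alpha-state satisfies beta.
    Labels are singletons here, so each state is identified with one world."""
    ext = {s for s, w in world_of.items() if alpha(w)}
    return all(beta(world_of[s]) for s in minimal(ext, prec))

def is_modular(states, prec):
    """Modularity: whenever x is below y, every z is above x or below y."""
    return all((x, z) in prec or (z, y) in prec
               for (x, y) in prec for z in states)

# Hypothetical model: worlds are frozensets of the true variables.
world_of = {
    "u": frozenset({"a", "b", "c"}),
    "v": frozenset({"a", "b"}),
    "w": frozenset({"a", "c"}),
}
prec = {("w", "v")}          # w is more normal than v; u is incomparable

a = lambda x: "a" in x
c = lambda x: "c" in x
not_b = lambda x: "b" not in x
a_and_b = lambda x: "a" in x and "b" in x

a_then_c = holds(world_of, prec, a, c)
a_then_not_b = holds(world_of, prec, a, not_b)
ab_then_c = holds(world_of, prec, a_and_b, c)
ranked = is_modular(set(world_of), prec)
```

The two premises of Rational Monotonicity are met (the first conditional holds, the second does not), but the conclusion fails, and the order is indeed not modular.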

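Remark 84

• A conditional C-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is also a conditional
C-theory extending any theory $\mathcal{TH}'_{\to} \subseteq \mathcal{TH}_{\to}$

• it is easy to see that a conditional C-theory $\mathcal{TH}_{\Rightarrow}$ is consistent iff $\mathcal{TH}_{\Rightarrow}$ is
non-trivial, i.e. $\mathcal{TH}_{\Rightarrow} \neq \mathcal{L}_{\Rightarrow}$ (use Right Weakening and Cautious
Monotonicity)

• if a conditional C-theory $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$ is consistent, then also
$\mathcal{TH}_{\to}$ is consistent, i.e. $\mathcal{TH}_{\to} \nvdash \bot$ where $\bot$ is the logical falsum (use
Reflexivity and Right Weakening).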

Cumulativity, i.e., Cautious Cut and Cautious Monotonicity taken together,
has been suggested by Gabbay[54] as a valid closure property of plausible
reasoning, independently of any semantical treatment. The stronger system
P has developed into a kind of standard system of nonmonotonic logic, since it
has been proved sound and complete with respect to many different semantics
of nonmonotonic logic (some of them have been collected in Gabbay et al.[55];
see also Gärdenfors&Makinson[58], chapter 4.3 in Fuhrmann[53], Benferhat et
al.[17], and, more recently, Benferhat et al.[18]). Psychological findings, though
still on a very preliminary level, indicate that P incorporates some of the
rationality postulates governing human commonsense reasoning (see Da Silva
et al.[36], [37], Pfeifer[121], Pfeifer&Kleiter[122]). Cumulative systems of
nonmonotonic logic are also strongly connected to Spohn’s[167] ordinal conditional
functions (ranking functions).



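The following rules are (meta-)derivable from the systems introduced
above:

Lemma 85 (Kraus et al.[85], pp. 179-180)

The following rules are easily derivable in C (i.e., if the premises of
the following rules are members of a conditional C-theory $\mathcal{TH}_{\Rightarrow}$, then also their
conclusions):

1. $\dfrac{\alpha[x] \Rightarrow \beta[x],\ \alpha[x] \Rightarrow \gamma[x]}{\alpha[x] \Rightarrow \beta[x] \wedge \gamma[x]}$ (And)

2. $\dfrac{\alpha[x] \Rightarrow \beta[x],\ \beta[x] \Rightarrow \alpha[x],\ \alpha[x] \Rightarrow \gamma[x]}{\beta[x] \Rightarrow \gamma[x]}$ (Equivalence)

3. $\dfrac{\alpha[x] \Rightarrow (\beta[x] \to \gamma[x]),\ \alpha[x] \Rightarrow \beta[x]}{\alpha[x] \Rightarrow \gamma[x]}$ (Modus Ponens in the Consequent)

4. $\dfrac{\alpha[x] \vee \beta[x] \Rightarrow \alpha[x],\ \alpha[x] \Rightarrow \gamma[x]}{\alpha[x] \vee \beta[x] \Rightarrow \gamma[x]}$

5. $\dfrac{\mathcal{TH}_{\to} \vdash \alpha[x] \to \beta[x]}{\alpha[x] \Rightarrow \beta[x]}$ (Supra-Classicality)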

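Lemma 86 (Kraus et al.[85], p.191)

The following rules are derivable in P:

1. $\dfrac{\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]}{\alpha[x] \Rightarrow (\beta[x] \to \gamma[x])}$

2. $\dfrac{\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x],\ \alpha[x] \wedge \neg\beta[x] \Rightarrow \gamma[x]}{\alpha[x] \Rightarrow \gamma[x]}$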

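Lemma 87 (Kraus et al.[85], p.201)

The following rules are derivable in CM:

1. Loop

2. $\dfrac{\alpha[x] \Rightarrow \beta[x],\ \beta[x] \Rightarrow \gamma[x]}{\alpha[x] \Rightarrow \gamma[x]}$ (Transitivity)

3. $\dfrac{\alpha[x] \Rightarrow (\beta[x] \to \gamma[x])}{\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]}$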

Lemma 88 (Kraus et al.[85], p.202) 

The following rules are derivable in M: 

1. Or 

2. Monotonicity. 

Lemma 89 Furthermore, the following is derivable in M: 

a[x\ (3[x\ 

T [a[x] P\x\) 



T ^ {a[x] -» /3[x]) 
a[x] — > (3[x\ 
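Definition 90 (Derivation in C/CL/P/CM/M)

Let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. A C-derivation (rel. to $\mathcal{TH}_{\to}$) of $\varphi[x] \Rightarrow \psi[x]$ from $KB_{\Rightarrow}$ is a finite
sequence

$\langle \alpha_1[x] \Rightarrow \beta_1[x], \ldots, \alpha_k[x] \Rightarrow \beta_k[x] \rangle$

where $\alpha_k[x] = \varphi[x]$, $\beta_k[x] = \psi[x]$, and for all $i \in \{1, \ldots, k\}$ at least one
of the following conditions is satisfied:

• $\alpha_i[x] \Rightarrow \beta_i[x] \in KB_{\Rightarrow}$

• $\alpha_i[x] \Rightarrow \beta_i[x]$ is an instance of Reflexivity

• $\alpha_i[x] \Rightarrow \beta_i[x]$ is the conclusion of one of the rules of C, s.t. the
conditional premises of the very rule are among
$\{\alpha_1[x] \Rightarrow \beta_1[x], \ldots, \alpha_{i-1}[x] \Rightarrow \beta_{i-1}[x]\}$, and in the case of Left
Logical Equivalence, Right Weakening, and Monotonicity, the derivability
conditions concerning $\mathcal{TH}_{\to}$ are satisfied

2. $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{C} \varphi[x] \Rightarrow \psi[x]$

($\varphi[x] \Rightarrow \psi[x]$ is C-derivable rel. to $\mathcal{TH}_{\to}$ from $KB_{\Rightarrow}$)

iff there is a C-derivation of $\varphi[x] \Rightarrow \psi[x]$ rel. to $\mathcal{TH}_{\to}$ from $KB_{\Rightarrow}$

3. $Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow}) = \{\varphi[x] \Rightarrow \psi[x] \mid KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{C} \varphi[x] \Rightarrow \psi[x]\}$

(the conditional C-closure of $KB_{\Rightarrow}$ rel. to $\mathcal{TH}_{\to}$)

4. $\varphi[x] \Rightarrow \psi[x]$ is C-provable (rel. to $\mathcal{TH}_{\to}$) iff $\emptyset \vdash^{\mathcal{TH}_{\to}}_{C} \varphi[x] \Rightarrow \psi[x]$

and analogously for the systems CL/P/CM/M.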

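Remark 91

1. As usual, it follows that

(a) $KB_{\Rightarrow} \subseteq Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})$

(b) if $KB_{\Rightarrow} \subseteq KB'_{\Rightarrow}$ then $Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow}) \subseteq Ded^{\mathcal{TH}_{\to}}_{C}(KB'_{\Rightarrow})$

(c) $Ded^{\mathcal{TH}_{\to}}_{C}(Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})) = Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})$

2. obviously, $\mathcal{TH}_{\Rightarrow}$ is a conditional C-theory extending $\mathcal{TH}_{\to}$ iff
$Ded^{\mathcal{TH}_{\to}}_{C}(\mathcal{TH}_{\Rightarrow}) = \mathcal{TH}_{\Rightarrow}$.

Since $Ded^{\mathcal{TH}_{\to}}_{C}(Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})) = Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})$, $Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})$
is a conditional C-theory extending $\mathcal{TH}_{\to}$ for arbitrary $KB_{\Rightarrow}$. In particular,
$Ded^{\mathcal{TH}_{\to}}_{C}(\emptyset)$ (the set of formulas which are C-provable rel. to $\mathcal{TH}_{\to}$)
is a conditional C-theory extending $\mathcal{TH}_{\to}$

3. $Ded^{\mathcal{TH}_{\to}}_{C}(KB_{\Rightarrow})$ is the smallest conditional C-theory extending $\mathcal{TH}_{\to}$
which contains $KB_{\Rightarrow}$

4. if the implication sign $\Rightarrow$ is replaced by the material implication sign $\to$,
we may write ‘$KB_{\to} \cup \mathcal{TH}_{\to} \vdash \varphi[x] \to \psi[x]$’ instead of
‘$KB_{\to} \vdash^{\mathcal{TH}_{\to}}_{M} \varphi[x] \to \psi[x]$’, since M is just classical logic.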
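For a finite set of worlds, the closure operator of system C can be computed by brute force, which makes its closure properties easy to check on examples. The sketch below rests on simplifying assumptions that are not in the text: formulas are identified with the sets of worlds in which they are true (so Left Equivalence holds automatically), the universal theory is taken to be the one satisfied by the given worlds (so that a universal conditional is derivable from it exactly when one set of worlds is included in the other), and `c_closure` is a hypothetical helper name.

```python
from itertools import combinations

def propositions(worlds):
    """All subsets of the world set, each standing for a class of
    logically equivalent formulas."""
    ws = sorted(worlds)
    return [frozenset(c) for r in range(len(ws) + 1)
            for c in combinations(ws, r)]

def c_closure(kb, worlds):
    """Smallest set of pairs (a, b), read as a => b, that contains kb and
    Reflexivity and is closed under Right Weakening, Cut, and
    Cautious Monotonicity."""
    props = propositions(worlds)
    th = set(kb) | {(a, a) for a in props}       # Reflexivity
    while True:
        new = set()
        for (a, b) in th:
            for c in props:
                if b <= c:
                    new.add((a, c))              # Right Weakening
            for (x, c) in th:
                if x == a:
                    new.add((a & b, c))          # Cautious Monotonicity
                if x == a & b:
                    new.add((a, c))              # Cut
        if new <= th:
            return th
        th |= new

# Hypothetical knowledge base over three worlds 0, 1, 2.
worlds = {0, 1, 2}
top = frozenset(worlds)
kb_small = {(top, frozenset({0, 1}))}
kb = kb_small | {(top, frozenset({1, 2}))}
closure = c_closure(kb, worlds)
```

On this example the closure contains the conditional from the tautology to the conjunction of the two consequents, in line with the derivability of And in C, and it does not contain the conditional from the tautology to the falsum, so the resulting theory is consistent.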

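In the case of R, derivability has to be defined differently due to the
presence of the Rational Monotonicity “rule”:

Definition 92 (“Derivation” in R)

Let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{R} \varphi[x] \Rightarrow \psi[x]$ ($\varphi[x] \Rightarrow \psi[x]$ is R-derivable rel. to $\mathcal{TH}_{\to}$
from $KB_{\Rightarrow}$) iff $\varphi[x] \Rightarrow \psi[x] \in \bigcap \{\mathcal{TH}_{\Rightarrow} \mid \mathcal{TH}_{\Rightarrow} \supseteq KB_{\Rightarrow}$, $\mathcal{TH}_{\Rightarrow}$ is a cond.
R-theory extend. $\mathcal{TH}_{\to}\}$

2. $Ded^{\mathcal{TH}_{\to}}_{R}(KB_{\Rightarrow}) = \{\varphi[x] \Rightarrow \psi[x] \mid KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{R} \varphi[x] \Rightarrow \psi[x]\}$

(the conditional R-closure of $KB_{\Rightarrow}$ rel. to $\mathcal{TH}_{\to}$)

3. $\varphi[x] \Rightarrow \psi[x]$ is R-provable (rel. to $\mathcal{TH}_{\to}$) iff $\emptyset \vdash^{\mathcal{TH}_{\to}}_{R} \varphi[x] \Rightarrow \psi[x]$.

$Ded^{\mathcal{TH}_{\to}}_{R}$ satisfies the same closure conditions as stated above.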

It follows: 

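Theorem 93 (Containment Relations for the Different Systems with respect
to Derivability)

Let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$; let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be a deductively
closed set of universal conditionals;

it holds:

1. if $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{C} \alpha[x] \Rightarrow \beta[x]$ then $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{CL} \alpha[x] \Rightarrow \beta[x]$

2. if $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{CL} \alpha[x] \Rightarrow \beta[x]$ then $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{P} \alpha[x] \Rightarrow \beta[x]$

3. $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{P} \alpha[x] \Rightarrow \beta[x]$ iff $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{R} \alpha[x] \Rightarrow \beta[x]$

4. if $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{R} \alpha[x] \Rightarrow \beta[x]$ then $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{M} \alpha[x] \Rightarrow \beta[x]$

5. if $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{CL} \alpha[x] \Rightarrow \beta[x]$ then $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{CM} \alpha[x] \Rightarrow \beta[x]$

6. if $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{CM} \alpha[x] \Rightarrow \beta[x]$ then $KB_{\Rightarrow} \vdash^{\mathcal{TH}_{\to}}_{M} \alpha[x] \Rightarrow \beta[x]$.
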

Proof: 

Each claim except for 3 is obvious. 3 is shown by Lehmann&Magidor[88] 
on pp. 24 f (though in semantical terms). ■ 
Theorem 94 (Equivalence of the Different Systems with respect to Provability)

Let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$, let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be a deductively closed set of
universal conditionals;

the following claims are equivalent:

1. $\alpha[x] \Rightarrow \beta[x]$ is C-provable (rel. to $\mathcal{TH}_{\to}$)

2. $\alpha[x] \Rightarrow \beta[x]$ is CL-provable (rel. to $\mathcal{TH}_{\to}$)

3. $\alpha[x] \Rightarrow \beta[x]$ is P-provable (rel. to $\mathcal{TH}_{\to}$)

4. $\alpha[x] \Rightarrow \beta[x]$ is R-provable (rel. to $\mathcal{TH}_{\to}$)

5. $\alpha[x] \Rightarrow \beta[x]$ is CM-provable (rel. to $\mathcal{TH}_{\to}$)

6. $\alpha[x] \Rightarrow \beta[x]$ is M-provable (rel. to $\mathcal{TH}_{\to}$)

7. $\mathcal{TH}_{\to} \vdash \alpha[x] \to \beta[x]$.

Proof:

If the set of premises is empty, formulas are derivable in each of the
systems C to M either by (i) Reflexivity, or by (ii) derivation from $\mathcal{TH}_{\to}$, i.e.,
essentially, by the Supra-Classicality rule, or by (iii) applying deductively valid
rules to (i), (ii), or previous instances of (iii); each of these proof steps for,
say, $\alpha[x] \Rightarrow \beta[x]$, may also be applied to yield the classical proof of a universal
conditional $\alpha[x] \to \beta[x]$ that is contained in $\mathcal{TH}_{\to}$ by deductive closure.
Conversely, by Supra-Classicality again, if $\alpha[x] \to \beta[x] \in \mathcal{TH}_{\to}$, then $\alpha[x] \Rightarrow \beta[x]$
is provable in each of the systems referred to. ■

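Theorem 95 (Containment Relations for the Different Systems with respect
to Theories)

Let $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be a deductively closed set of
universal conditionals;

it holds:

1. if $\mathcal{TH}_{\Rightarrow}$ is a conditional M-theory extending $\mathcal{TH}_{\to}$,
then $\mathcal{TH}_{\Rightarrow}$ is a conditional R-theory extending $\mathcal{TH}_{\to}$

2. if $\mathcal{TH}_{\Rightarrow}$ is a conditional R-theory extending $\mathcal{TH}_{\to}$,
then $\mathcal{TH}_{\Rightarrow}$ is a conditional P-theory extending $\mathcal{TH}_{\to}$

3. if $\mathcal{TH}_{\Rightarrow}$ is a conditional P-theory extending $\mathcal{TH}_{\to}$,
then $\mathcal{TH}_{\Rightarrow}$ is a conditional CL-theory extending $\mathcal{TH}_{\to}$

4. if $\mathcal{TH}_{\Rightarrow}$ is a conditional CL-theory extending $\mathcal{TH}_{\to}$,
then $\mathcal{TH}_{\Rightarrow}$ is a conditional C-theory extending $\mathcal{TH}_{\to}$

5. if $\mathcal{TH}_{\Rightarrow}$ is a conditional M-theory extending $\mathcal{TH}_{\to}$,
then $\mathcal{TH}_{\Rightarrow}$ is a conditional CM-theory extending $\mathcal{TH}_{\to}$

6. if $\mathcal{TH}_{\Rightarrow}$ is a conditional CM-theory extending $\mathcal{TH}_{\to}$,
then $\mathcal{TH}_{\Rightarrow}$ is a conditional CL-theory extending $\mathcal{TH}_{\to}$.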



Proof: 

The implications are immediate consequences of the definitions above. 




Chapter 11 

SOUNDNESS AND COMPLETENESS RESULTS 



Now we can relate the semantical systems of sections 9.1, 9.2 and 9.3 to the
syntactical systems of chapter 10 by means of soundness and completeness
theorems. We are going to present three kinds of such theorems: (“strong”)
soundness and completeness concerning derivability on the syntactical side and
entailment on the semantical side, (“weak”) soundness and completeness
concerning provability on the syntactical side and validity on the semantical side,
and (“strong”) soundness and completeness concerning conditional theories on
the syntactical side and conditional theories that are associated with models
on the semantical side. The completeness parts of the theorems of the latter
kind correspond to the type of completeness theorem by which a consistent
classical theory is shown to have a non-empty set of classical models; in the
case of probability semantics, such a kind of theorem is of course only available
if the probability semantics considered includes the notion of a model.

In the following we say that W is the set of worlds satisfying $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$,
or equivalently that $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ is the set of universal formulas
(universally) satisfied by W, if and only if W is the set of propositional variable
settings $w$, s.t. for all $\alpha[x] \to \beta[x] \in \mathcal{TH}_{\to}$: $w \models \alpha[a] \to \beta[a]$.

The subsequent result for universal models is directly derivable from
the soundness and completeness theorems for classical logic, although the result
is stated in a kind of nonstandard way in order to guarantee a smooth transition
to the soundness and completeness theorems for the non-classical semantics of
section 9.2 and, in particular, of section 9.3:

Theorem 96 (Soundness and Completeness of M with respect to Universal 
Model Semantics) 

Let KB-^ C C^, a[x] /3[x] G £_>; let TH-^ C he the set of 
universal formulas (universally) satisfied by the given set W of worlds: 

1. KB^ h™- a[x] ^ ^[x] iff 

U h o:[a;] l3{x] iff 

KB^ a[x] (3[x] 
2. a[x] P[x] is M-provable (rel to TH-^) iff
rU-. h a[x] (3[x\ iff 

a[x] — ^ /3[x] is universally valid 

3. TH!^ C is a consistent conditional M-theory extending ijf 

there is a universal model based on the set W of worlds satisfying 
s.t rn'^ = 

The following soundness and completeness theorems hold for the prob- 
ability semantics: 

Theorem 97 (Derivability Soundness and Completeness of P with respect to 
Probability Semantics; see Adamsfl], [2], [3], [6]) 

Let C a[x] ^hp P[x] G let TH^ C be the 

set of universal formulas satisfied by the given set W of worlds; 
then each of the claims in theorem 69 is equivalent to: 

ct[x] ^hp I3[x] 

By the last theorem and by the theorems 70 and 94 we have: 

Theorem 98 (Provability Soundness and Completeness of C/CL/P/R/CM/M 
with respect to Probability Semantics) 

Let a[x] =>hp P[x] G then each of the claims in theorem 69 is 

equivalent to: 

a[x] =>hp (3[x] is C/CL/P/R/CM/M-provable (rel. to TH^) 

(and this is the case if and only ifTH-^ h a[x] f3[x\). 

Theorem 99 (Theory Soundness and Completeness of R with respect to Se- 
quence Semantics) 

Let rU-^ C C^: 

C is a consistent conditional R-theory extending T1~L-^ 

iff 

there is a probabilistic sequence model based on the set W of 

worlds satisfying s.t. TH^y,^ = 

Proof: 

The direction from the right to the left may be extracted from Adams [4] 
and the much more detailed SchurzflfS] . The other direction is implicitly con- 
tained in 

Lehmann&MagidorfSS] , p.27. ■ 

Theorem 100 (Theory Soundness and Completeness of P with respect to Order- 
of- Magnitude Semantics) 

Let rU-. C 
C is a consistent conditional P-theory extending TH-^

iff 

there is a set X of probabilistic order- of -magnitude models based on the 
set W of worlds satisfying TH^, s.t. — TH=^^^{X). 

Proof: 

The proof is quite easily composed of the results in Adams[6], pp.lSf- 
137, Kraus et al.[85], and Lehmann&Magidor[88] .* ■ 

Theorem 101 (Theory Soundness and Completeness of R with respect to In- 
finitesimaljj Semantics; see Lehmann&Magidor[88] , pp. 53-58) 

LetTH-^ C 

C is a consistent conditional R-theory extending 

iff 

there is an infinitesimalu probabilistic model based on the set 

W of worlds satisfying TTL^, s.t 

Kraus et al.[85] and Lehmann&Magidor[88] show the following sound- 
ness and completeness theorems for normality semantics (though stated in dif- 
ferent terms): 

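Theorem 102 (Derivability Soundness and Completeness of C/CL/P/R/CM/M
with respect to Normality Semantics; see Kraus et al.[85],
Lehmann&Magidor[88])

Let $KB_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$, $\alpha[x] \Rightarrow_{nor} \beta[x] \in \mathcal{L}_{\Rightarrow_{nor}}$, let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be the
set of universal formulas satisfied by every world in the given set W of worlds;
it holds:

1. $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{C} \alpha[x] \Rightarrow_{nor} \beta[x]$ iff $KB_{\Rightarrow_{nor}} \models_c \alpha[x] \Rightarrow_{nor} \beta[x]$

2. $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{CL} \alpha[x] \Rightarrow_{nor} \beta[x]$ iff $KB_{\Rightarrow_{nor}} \models_{co} \alpha[x] \Rightarrow_{nor} \beta[x]$

3. $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{P} \alpha[x] \Rightarrow_{nor} \beta[x]$ iff $KB_{\Rightarrow_{nor}} \models_p \alpha[x] \Rightarrow_{nor} \beta[x]$

4. $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{R} \alpha[x] \Rightarrow_{nor} \beta[x]$ (iff $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{P} \alpha[x] \Rightarrow_{nor} \beta[x]$) iff
$KB_{\Rightarrow_{nor}} \models_r \alpha[x] \Rightarrow_{nor} \beta[x]$

5. $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{CM} \alpha[x] \Rightarrow_{nor} \beta[x]$ iff $KB_{\Rightarrow_{nor}} \models_{sc} \alpha[x] \Rightarrow_{nor} \beta[x]$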

*It is also easy to see that the following holds: 

(x\x\ ^ 

call — the CEM rule (after a well-known and controversial axiom scheme 

Q:[x] ( d [ x \ ^ 

in conditional logic); 

T'Hz±^y^^ C C^y^^ is a consistent conditional P-h CEM- theory extending iff there is a 

probabilistic order-of-magnitude model based on the set W of worlds satisfying 

s.t. 
6. $KB_{\Rightarrow_{nor}} \vdash^{\mathcal{TH}_{\to}}_{M} \alpha[x] \Rightarrow_{nor} \beta[x]$ iff $KB_{\Rightarrow_{nor}} \models_{sp} \alpha[x] \Rightarrow_{nor} \beta[x]$.

By the last theorem and by the theorems 80 and 94 we have: 

Theorem 103 (Provability Soundness and Completeness of C/CL/P/R/CM/ 
M with respect to Normality Semantics) 

Let a[x] =^hp l3[x] G then each of the claims in theorem 69 is 

equivalent to: 

a[x] =^hp P[x] is C/CL/P/R/CM/M-provable (rel to TH-^) 

(and this is the case if and only ifTH-, h a[x] — > P[x]). 

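Theorem 104 (Theory Soundness and Completeness for C with respect to
Normality Semantics; see Kraus et al.[85], pp. 184-185)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$:

$\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$ is a consistent conditional C-theory extending $\mathcal{TH}_{\to}$

iff

there is a cumulative model $\mathfrak{M}_c$ based on the set W of worlds satisfying
$\mathcal{TH}_{\to}$, s.t. $\mathcal{TH}_{\Rightarrow_{nor}} = \mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_c)$.

Theorem 105 (Theory Soundness and Completeness for CL with respect to
Normality Semantics; see Kraus et al.[85], p. 189)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$:

$\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$ is a consistent conditional CL-theory extending $\mathcal{TH}_{\to}$

iff

there is a cumulative-ordered model $\mathfrak{M}_{co}$ based on the set W of worlds
satisfying $\mathcal{TH}_{\to}$, s.t. $\mathcal{TH}_{\Rightarrow_{nor}} = \mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{co})$.

Theorem 106 (Theory Soundness and Completeness for P with respect to
Normality Semantics; see Kraus et al.[85], p. 196)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$:

$\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$ is a consistent conditional P-theory extending $\mathcal{TH}_{\to}$

iff

there is a preferential model $\mathfrak{M}_p$ based on the set W of worlds satisfying
$\mathcal{TH}_{\to}$, s.t. $\mathcal{TH}_{\Rightarrow_{nor}} = \mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_p)$.

Theorem 107 (Theory Soundness and Completeness for R with respect to
Normality Semantics; see Lehmann&Magidor[88], pp. 21-23)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$:

$\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$ is a consistent conditional R-theory extending $\mathcal{TH}_{\to}$

iff

there is a ranked model $\mathfrak{M}_r$ based on the set W of worlds satisfying
$\mathcal{TH}_{\to}$, s.t. $\mathcal{TH}_{\Rightarrow_{nor}} = \mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_r)$.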






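Theorem 108 (Theory Soundness and Completeness for CM with respect to
Normality Semantics; see Kraus et al.[85], p. 201)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$:

$\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$ is a consistent conditional CM-theory extending $\mathcal{TH}_{\to}$

iff

there is a simple cumulative model $\mathfrak{M}_{sc}$ based on the set W of worlds
satisfying $\mathcal{TH}_{\to}$, s.t. $\mathcal{TH}_{\Rightarrow_{nor}} = \mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{sc})$.

Theorem 109 (Theory Soundness and Completeness for M with respect to
Normality Semantics; see Kraus et al.[85], p. 203)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$:

$\mathcal{TH}_{\Rightarrow_{nor}} \subseteq \mathcal{L}_{\Rightarrow_{nor}}$ is a consistent conditional M-theory extending $\mathcal{TH}_{\to}$

iff

there is a simple preferential model $\mathfrak{M}_{sp}$ based on the set W of worlds
satisfying $\mathcal{TH}_{\to}$, s.t. $\mathcal{TH}_{\Rightarrow_{nor}} = \mathcal{TH}_{\Rightarrow_{nor}}(\mathfrak{M}_{sp})$.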

Remark 110 The consistency constraint on $\mathcal{TH}_{\Rightarrow_{nor}}$ in the theorems above
seems to have been overlooked by Kraus et al.[85] and Lehmann&Magidor[88].

Note that since $\mathcal{L}$ has in our case only finitely many propositional
variables, a normal states model $\mathfrak{M}$ based on $\mathcal{L}$ has only finitely many worlds as
components. However, $\mathfrak{M}$ may nevertheless include an infinite set of states,
but, as may be seen from the proofs of the completeness theorems above,
theorems 102-109 still hold if we add the property of finiteness of the set of states
to the right sides of the ‘iff’ clauses above, as long as $\mathcal{L}$ is logically finite.

Logical entailment in P (R) may be strengthened in various directions,
among which there are: (i) strengthening in terms of truth preservation
not in every preferential or rational model that is based on the given set of
worlds but rather in certain particularly relevant ones (see the notion of rational
closure in Lehmann&Magidor[88], the system Z in Pearl[119], and the related
probabilistic version of Bamber[16]; see also the maximum entropy approach
in Pearl[118]); (ii) strengthening in terms of irrelevance assumptions that are
expressed in the object language (see Geffner&Pearl[60], and Schurz[147]). The
entailment relations of the thus defined semantical systems are no longer
monotonic with respect to the addition of conditionals within the set of premises. If
we had made use of a notion of highly (not absolutely) second-order reliability
for reasoning processes, we would have defined such a notion by one of the
strengthened entailment relations referred to in (i) and (ii). For simplicity, we
neglect these (and other) types of strengthening.




Chapter 12 

FURTHER CONSEQUENCES FOR JUSTIFIED INFERENCE 



The models and that we have referred to in section 7.1 of part 

II as models that satisfy universal conditionals, high probability conditionals, 
and normic conditionals, respectively, can now be characterized more clearly: 
is a universal model for the set Wact of propositional variable set- 
tings associated with the actual domain Dact of our cognitive agent’s sandbox 
area relative to the intended interpretation 3acu ^act presupposed to 

be one of the probabilistic models defined in section 9.2 for the actual proba- 
bility measure Probact on p(Wact); particularly, the infinitesimal// semantics 
seems to match our pre-theoretical intuitions concerning the vague notion of 
“high” probability, while the noninfinitesimal semantics highlights the practi- 
cal dimension of the high probability. may be assumed to be one of the 

normality models defined in section 9.3 for the actual normality order ~<act of 
(states labelled by sets of) the worlds of Wacu f^e preferential and the ranked 
model semantics are perhaps the most adequate ones. 

We have furthermore considered a model 9Jlact in section 7.1 and the 
subsequent sections of part II that has been regarded as the “superposition” 
of ^act^ ^act^ ^act perhaps of other components. dJlact might now 
be defined as a tuple (371“^, 971^* , • • •). and 9Jlact satisfies universal, high 

probability, or normic conditionals, if its corresponding model components do. 

This specification of the semantical notions by which the different kinds 
of qualitative reliability for inferences have been defined in section 8.2, together 
with the soundness results for conditional theories stated in section 11, and 
corollary 56 of section 8.5, entails certain closure properties of justified mono- 
tonic and justified nonmonotonic inference. E.g., the following property holds 
for both justified monotonic and justified nonmonotonic inferences: if A draws 
at some time a justified inference from $\alpha[a]$ to $\beta[a]$, at some (perhaps different) 
time a justified inference from $\alpha[a]$ to $\gamma[a]$, and if A furthermore draws at some 
time a basic inference from $\alpha[a] \wedge \beta[a]$ to $\gamma[a]$, then this latter inference is also 
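justified. 
More generally, we have: 

Corollary 111 (Closure Properties for Justified Monotonic Inferences) 

Let $(s(t))_{t \in In}$ be a trajectory of A, let $t$, $t'$, $t''$ be arbitrary points of 
time being members of In; 

let $TH_{act}$ = 

1. (Reflexivity) 

if A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the basic direct monotonic 
inference from $\alpha[a]$ to $\alpha[a]$, then: 

$\langle t - \Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \alpha[a])$ 

2. (Left Equivalence) 

if $TH_{act} \vdash \alpha[x] \rightarrow \beta[x]$, $TH_{act} \vdash \beta[x] \rightarrow \alpha[x]$, 

$\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \gamma[a])$, and 

if A draws from $t' - \Delta t_c$ to $t'$ rel. to $(s(t))_{t \in In}$ the basic direct monotonic 
inference from $\beta[a]$ to $\gamma[a]$, then: 

$\langle t' - \Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\beta[a] \rightarrow_{inf} \gamma[a])$ 

3. (Right Weakening) 

if $TH_{act} \vdash \alpha[x] \rightarrow \beta[x]$, $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\gamma[a] \rightarrow_{inf} \alpha[a])$, 

and 

if A draws from $t' - \Delta t_c$ to $t'$ rel. to $(s(t))_{t \in In}$ the basic direct monotonic 
inference from $\gamma[a]$ to $\beta[a]$, then: 

$\langle t' - \Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\gamma[a] \rightarrow_{inf} \beta[a])$ 

4. (Cautious Cut) 

if $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \wedge \beta[a] \rightarrow_{inf} \gamma[a])$, 

$\langle t' - n'\Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \beta[a])$, and 

if A draws from $t'' - \Delta t_c$ to $t''$ rel. to $(s(t))_{t \in In}$ the basic direct monotonic 
inference from $\alpha[a]$ to $\gamma[a]$, then: 

$\langle t'' - \Delta t_c, \ldots, t'' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \gamma[a])$ 

5. (Cautious Monotonicity) 

if $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \beta[a])$, 

$\langle t' - n'\Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \gamma[a])$, and 

if A draws from $t'' - \Delta t_c$ to $t''$ rel. to $(s(t))_{t \in In}$ the basic direct monotonic 
inference from $\alpha[a] \wedge \beta[a]$ to $\gamma[a]$, then: 

$\langle t'' - \Delta t_c, \ldots, t'' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \wedge \beta[a] \rightarrow_{inf} \gamma[a])$ 

6. (Contraposition) 

if $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \rightarrow_{inf} \beta[a])$, and 

if A draws from $t' - \Delta t_c$ to $t'$ rel. to $(s(t))_{t \in In}$ the basic direct monotonic 
inference from $\neg\beta[a]$ to $\neg\alpha[a]$, then: 

$\langle t' - \Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\neg\beta[a] \rightarrow_{inf} \neg\alpha[a])$ 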

By the lemmata stated in chapter 10, this entails analogous closure 
properties of basic direct justified monotonic inference with respect to Or, 
Monotonicity, and Transitivity. 

For basic direct nonmonotonic inferences we have: 

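Corollary 112 ([Minimal] Closure Properties for Justified Nonmonotonic Inferences) 

Let $(s(t))_{t \in In}$ be a trajectory of A, let $t$, $t'$, $t''$ be arbitrary points of 
time being members of In; 

let $TH_{act} = Th(\mathfrak{M}_{act})$: 

1. (Reflexivity) 

if A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the basic direct nonmonotonic 
inference from $\alpha[a]$ to $\alpha[a]$, then: 

$\langle t - \Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \Rightarrow_{inf} \alpha[a])$ 

2. (Left Equivalence) 

if $TH_{act} \vdash \alpha[x] \rightarrow \beta[x]$, $TH_{act} \vdash \beta[x] \rightarrow \alpha[x]$, 

$\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \Rightarrow_{inf} \gamma[a])$, and 

if A draws from $t' - \Delta t_c$ to $t'$ rel. to $(s(t))_{t \in In}$ the basic direct 
nonmonotonic inference from $\beta[a]$ to $\gamma[a]$, then: 

$\langle t' - \Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\beta[a] \Rightarrow_{inf} \gamma[a])$ 

3. (Right Weakening) 

if $TH_{act} \vdash \alpha[x] \rightarrow \beta[x]$, $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\gamma[a] \Rightarrow_{inf} \alpha[a])$, 
and 

if A draws from $t' - \Delta t_c$ to $t'$ rel. to $(s(t))_{t \in In}$ the basic direct 
nonmonotonic inference from $\gamma[a]$ to $\beta[a]$, then: 

$\langle t' - \Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\gamma[a] \Rightarrow_{inf} \beta[a])$ 

4. (Cautious Cut) 

if $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \wedge \beta[a] \Rightarrow_{inf} \gamma[a])$, 

$\langle t' - n'\Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \Rightarrow_{inf} \beta[a])$, and 

if A draws from $t'' - \Delta t_c$ to $t''$ rel. to $(s(t))_{t \in In}$ the basic direct 
nonmonotonic inference from $\alpha[a]$ to $\gamma[a]$, then: 

$\langle t'' - \Delta t_c, \ldots, t'' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \Rightarrow_{inf} \gamma[a])$ 

5. (Cautious Monotonicity) 

if $\langle t - n\Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \Rightarrow_{inf} \beta[a])$, 

$\langle t' - n'\Delta t_c, \ldots, t' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \Rightarrow_{inf} \gamma[a])$, and 

if A draws from $t'' - \Delta t_c$ to $t''$ rel. to $(s(t))_{t \in In}$ the basic direct 
nonmonotonic inference from $\alpha[a] \wedge \beta[a]$ to $\gamma[a]$, then: 

$\langle t'' - \Delta t_c, \ldots, t'' \rangle, (s(t))_{t \in In} \models J(\alpha[a] \wedge \beta[a] \Rightarrow_{inf} \gamma[a])$. 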

Corollary 112 may be strengthened if high reliability is restricted to 
high probabilistic reliability, or to high normic reliability based on a normality 
model $\mathfrak{M}_{act}^{nor}$ that is cumulative-ordered or stronger. 
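
The model-theoretic core of Cautious Cut and Cautious Monotonicity can be checked mechanically on a small example. The following Python sketch is an illustration under stated assumptions, not a rendering of the definitions of this study: it assumes a hypothetical ranked model with four worlds (the names `RANK`, `minimal`, and `normally` are introduced here only for illustration), reads a normic conditional as "all most normal antecedent-worlds are consequent-worlds", and tests the closure conditions by brute force over all propositions. It does not model the temporal justification predicate J of corollary 112; it only indicates why the underlying conditionals are closed in this way while unrestricted Monotonicity fails.

```python
from itertools import combinations

# Hypothetical toy model: four worlds with normality ranks (0 = most normal).
RANK = {"w0": 0, "w1": 1, "w2": 1, "w3": 2}
WORLDS = frozenset(RANK)


def minimal(prop):
    """Return the most normal worlds inside a proposition (a set of worlds)."""
    if not prop:
        return frozenset()
    best = min(RANK[w] for w in prop)
    return frozenset(w for w in prop if RANK[w] == best)


def normally(a, b):
    """a => b holds iff every most normal a-world is a b-world."""
    return minimal(a) <= b


def all_props():
    """Enumerate every proposition, i.e., every subset of WORLDS."""
    ws = sorted(WORLDS)
    return [frozenset(c) for k in range(len(ws) + 1) for c in combinations(ws, k)]


def cautious_monotonicity_holds():
    # From a => b and a => c infer (a and b) => c.
    props = all_props()
    return all(
        normally(a & b, c)
        for a in props for b in props for c in props
        if normally(a, b) and normally(a, c)
    )


def cautious_cut_holds():
    # From (a and b) => c and a => b infer a => c.
    props = all_props()
    return all(
        normally(a, c)
        for a in props for b in props for c in props
        if normally(a & b, c) and normally(a, b)
    )


def monotonicity_holds():
    # From a => c infer (a and b) => c; expected to fail for normic conditionals.
    props = all_props()
    return all(
        normally(a & b, c)
        for a in props for b in props for c in props
        if normally(a, c)
    )
```

In this toy model the two cautious rules hold for all propositions, whereas strengthening the antecedent does not: the most normal worlds of a more specific antecedent need not be among the most normal worlds of the less specific one.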

It is also easy to see that corollary 112 entails by means of Cautious 
Monotonicity and Left Equivalence a kind of “specifity sensitiveness” of justi- 
fied nonmonotonic inferences (this also holds for justified monotonic inferences, 
but there it is a trivial fact): 

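Corollary 113 (Specifity Sensitiveness of Justified Nonmonotonic Inference) 
Let $(s(t))_{t \in In}$ be a trajectory of A, let $t$ be an arbitrary member of In; 
let $TH_{act} = Th(\mathfrak{M}_{act})$, let $\Rightarrow$ be either $\Rightarrow_{hp}$ or $\Rightarrow_{nor}$. 

if $\mathfrak{M}_{act} \models \alpha[x] \Rightarrow \gamma[x]$, 

$\mathfrak{M}_{act} \models \beta[x] \Rightarrow \neg\gamma[x]$, 

$\mathfrak{M}_{act} \models \beta[x] \Rightarrow \alpha[x]$, 

and if A draws from $t - \Delta t_c$ to $t$ rel. to $(s(t))_{t \in In}$ the basic direct 
nonmonotonic inference from $\alpha[a] \wedge \beta[a]$ to $\neg\gamma[a]$, then: 

$\langle t - \Delta t_c, \ldots, t \rangle, (s(t))_{t \in In} \models J(\alpha[a] \wedge \beta[a] \Rightarrow_{inf} \neg\gamma[a])$. 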

E.g.: since $Bird(x) \Rightarrow_{nor} CanFly(x)$, $Penguin(x) \Rightarrow_{nor} \neg CanFly(x)$, and 
$Penguin(x) \Rightarrow_{nor} Bird(x)$ are true, if A infers from $Bird(a) \wedge Penguin(a)$ to 
$\neg CanFly(a)$, then A does so justifiedly: since penguins are normally birds*, 
the belief that [$Penguin(a)$ is true] is more specific than the belief that 
[$Bird(a)$ is true], and A’s true general belief that [$Penguin(x) \Rightarrow \neg CanFly(x)$] 
should therefore be dominant over A’s true general belief that [$Bird(x) \Rightarrow CanFly(x)$] 
in the context of this situation. 

This kind of specifity sensitiveness is again related to Carnap’s re- 
quirement of total evidence and Hempel’s rule of maximal specifity, and it is a 
well-known facet of cumulative nonmonotonic reasoning. 
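
The bird and penguin example can be made concrete in a few lines of Python. The sketch below is only an illustration: it assumes a hypothetical ranking of the eight truth-value combinations of Bird, Penguin, and CanFly (the function `rank` and its values are chosen here for the example and are not taken from the text), and it evaluates a normic conditional by looking at the most normal worlds that satisfy the antecedent.

```python
from itertools import product

# A world assigns truth values to (bird, penguin, can_fly).
WORLDS = list(product([False, True], repeat=3))


def rank(world):
    """Hypothetical degree of abnormality of a world (0 = most normal)."""
    bird, penguin, can_fly = world
    if penguin:
        if not bird:
            return 2          # a penguin that is not a bird is highly abnormal
        return 2 if can_fly else 1
    if bird and not can_fly:
        return 1              # a non-penguin bird that cannot fly is abnormal
    return 0


def normally(antecedent, consequent):
    """antecedent => consequent: all most normal antecedent-worlds satisfy consequent."""
    candidates = [w for w in WORLDS if antecedent(w)]
    if not candidates:
        return True
    best = min(rank(w) for w in candidates)
    return all(consequent(w) for w in candidates if rank(w) == best)


bird = lambda w: w[0]
penguin = lambda w: w[1]
can_fly = lambda w: w[2]
cannot_fly = lambda w: not w[2]
bird_and_penguin = lambda w: w[0] and w[1]

general_rule = normally(bird, can_fly)                   # Bird => CanFly
specific_rule = normally(penguin, cannot_fly)            # Penguin => not CanFly
subclass_rule = normally(penguin, bird)                  # Penguin => Bird
specific_wins = normally(bird_and_penguin, cannot_fly)   # Bird and Penguin => not CanFly
general_loses = normally(bird_and_penguin, can_fly)      # Bird and Penguin => CanFly
```

Under this assumed ranking all three premises of corollary 113 hold, and the conditional with the combined antecedent sides with the more specific premise, as the corollary requires.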

*In this example case it is even true that all penguins are birds. 

Finally, by theorems 96, 97, and 102, we can specify def.52 of absolute 
second-order reliability for reasoning from either only universal, or only high 
probability, or only normic conditionals, to a conditional of a corresponding 
type in the following way: 

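Corollary 114 (Absolute Second-Order Reliability for $\rightarrow$, $\Rightarrow_{hp}$, $\Rightarrow_{nor}$) 

1. It is absolutely reliable to reason from $\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]$ 
to $\alpha[x] \rightarrow \beta[x]$ iff 

$\{\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]\} \models^{univ} \alpha[x] \rightarrow \beta[x]$ 

(where $\models^{univ}$ is defined relative to the set of all possible worlds for $\mathcal{L}$) iff 

$\{\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]\} \models \alpha[x] \rightarrow \beta[x]$ iff 
$\{\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]\} \vdash_{TH_\rightarrow} \alpha[x] \rightarrow \beta[x]$ 

(where $TH_\rightarrow$ is the set of classically valid formulas of $\mathcal{L}$) iff 
$\{\alpha_0[x] \rightarrow \beta_0[x], \ldots, \alpha_n[x] \rightarrow \beta_n[x]\} \vdash \alpha[x] \rightarrow \beta[x]$ 

2. It is absolutely reliable to reason from $\alpha_0[x] \Rightarrow_{hp} \beta_0[x], \ldots, \alpha_n[x] \Rightarrow_{hp} \beta_n[x]$ 
to $\alpha[x] \Rightarrow_{hp} \beta[x]$ iff 

$\{\alpha_0[x] \Rightarrow_{hp} \beta_0[x], \ldots, \alpha_n[x] \Rightarrow_{hp} \beta_n[x]\} \models_{\ldots} \alpha[x] \Rightarrow_{hp} \beta[x]$ 

(substitute for ‘$\ldots$’ any index for a high probability semantics, where 
$\models_{\ldots}$ is defined relative to the set of all possible worlds for $\mathcal{L}$) iff 

$\{\alpha_0[x] \Rightarrow_{hp} \beta_0[x], \ldots, \alpha_n[x] \Rightarrow_{hp} \beta_n[x]\} \vdash_{P_{TH_\rightarrow}} \alpha[x] \Rightarrow_{hp} \beta[x]$ 

(where $TH_\rightarrow$ is the set of classically valid formulas of $\mathcal{L}$) 

3. If it is absolutely reliable to reason from $\alpha_0[x] \Rightarrow_{nor} \beta_0[x], \ldots, \alpha_n[x] \Rightarrow_{nor} \beta_n[x]$ 
to $\alpha[x] \Rightarrow_{nor} \beta[x]$, then 

$\{\alpha_0[x] \Rightarrow_{nor} \beta_0[x], \ldots, \alpha_n[x] \Rightarrow_{nor} \beta_n[x]\} \models \alpha[x] \Rightarrow_{nor} \beta[x]$ 

(where $\models$ is defined relative to the set of all possible worlds for $\mathcal{L}$), and 

$\{\alpha_0[x] \Rightarrow_{nor} \beta_0[x], \ldots, \alpha_n[x] \Rightarrow_{nor} \beta_n[x]\} \vdash_{TH_\rightarrow} \alpha[x] \Rightarrow_{nor} \beta[x]$ 

(where $TH_\rightarrow$ is the set of classically valid formulas of $\mathcal{L}$). 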
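
For universal conditionals, the semantic side of item 1 can be illustrated by a brute-force check. The following Python sketch rests on a simplifying assumption that is ours: antecedents and consequents are built from three hypothetical one-place predicates `P`, `Q`, `R`, so that a counterexample to an entailment between universal conditionals can always be given by a single individual, i.e., by one truth-value assignment to the predicates. The function name `entails` is likewise only illustrative.

```python
from itertools import product

# Hypothetical one-place predicates; a valuation describes one individual.
ATOMS = ("P", "Q", "R")


def entails(premises, conclusion):
    """Check whether universal conditionals entail a universal conditional.

    Each conditional is a pair (antecedent, consequent) of functions on
    valuations. The entailment fails iff some individual type satisfies
    every premise conditional but violates the conclusion conditional.
    """
    antecedent, consequent = conclusion
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        premises_hold = all((not a(v)) or b(v) for a, b in premises)
        if premises_hold and antecedent(v) and not consequent(v):
            return False  # countermodel: a domain containing one such individual
    return True


P = lambda v: v["P"]
Q = lambda v: v["Q"]
R = lambda v: v["R"]
P_and_Q = lambda v: v["P"] and v["Q"]

transitivity = entails([(P, Q), (Q, R)], (P, R))   # P -> Q, Q -> R entail P -> R
monotonicity = entails([(P, R)], (P_and_Q, R))     # P -> R entails (P and Q) -> R
converse = entails([(P, Q)], (Q, P))               # P -> Q does not entail Q -> P
```

The check confirms Transitivity and Monotonicity for universal conditionals, the two rules that are not generally available for high probability and normic conditionals.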

Item 3 of Corollary 114 may be strengthened if high normic reliability 
is based on cumulative-ordered or stronger (e.g., preferential) models. 

Chapter 13 

INTRODUCTORY REMARKS 



By means of the semantical and proof-theoretical analysis of universal, high 
probability, and normic conditionals that we have given in part III, we have 
gained further insight into the (first- and second-order) notions of absolute 
and high reliability, and by that also into the justification of monotonic and 
nonmonotonic inferences. In this fourth part we are going to make use of the 
results of part III, together with the concepts and results of parts I and II, when 
we deal with the following questions: (i) is there a low-level agent which is ideal 
in the sense of section 8.6? (ii) If yes: what might the cognitive architecture 
of such an ideal agent look like? In particular: how may the typical properties 
of justified nonmonotonic inferences be implemented, i.e., the nonmonotonicity 
effect (that holds for nonmonotonic inferences in general) on the one hand, and 
the “optimum instability”, the closure properties, and the specifity sensitiveness 
(of justified nonmonotonic inferences) on the other hand? 

The answer to the first question turns out to be affirmative; this is 
proved by giving one possible response to the second question: an inferentially 
ideal cognitive agent may be constructed which has a cognitive architecture 
that is based on a simple qualitative neural network where nonmonotonicity is 
implemented by inhibition mechanisms. 

In parts I-III we have not been explicit on what the cognitive architec- 
ture of our agent A is like, since this has not been relevant for the explication, 
justification, or the logic of inference. In this part, however, we are going to 
study a more specific class of cognitive agents, each member of which has the 
properties assumed for A in the first three parts. In order to specify 
these agents in more detail, we have to determine (a) in which way they 
represent objects (properties, propositions, etc.), and (b) in which way they 
process these representations.* As far as our cognitive agent A is concerned, 
belief states are the primary examples of representational states: we have 
characterized belief states in part I as states having propositional contents, i.e., 
the belief that [$\varphi$ is true] is a state in which the proposition expressed by $\varphi$ is 
represented in a certain way. The cognitive processes which we are primarily 
interested in are of course inferences: inferences create belief states due to the 
presence of other belief states. In order to determine the cognitive architecture 
of A within our context, it is therefore sufficient to specify in more detail (a) the 
belief states of A and the way in which they represent propositional contents, 
and (b) how A’s inference processes are implemented. On the other hand, we 
will later simply assume that A’s belief states indeed represent propositional 
contents in this specified way and we will not say - except for the inferential 
activities that we are going to deal with extensively - what has to be the case 
such that A’s belief states really become manifest appropriately in terms of 
the internal or external activities that we have referred to in chapter 3 (such 
questions are among the topics of psychosemantics; see e.g. Fodor[48]). 

* According to the “intelligence without representation” view of Brooks [29] and also of 
a small subgroup of connectionists, representation is by no means a necessary prerequisite 
for intelligent cognition. But we take it for granted that every intelligent agent beyond some 
minimal degree of complexity has to employ representations; e.g., to the best of our empirical 
knowledge today, the cat in our cat&bird example uses representations in order to draw 
inferences. 

13.1 Two Paradigms of Cognitive Science 

Today’s cognitive science distinguishes between two main paradigms of cognitive 
architectures: the symbolic computation paradigm (rules-and-representations 
model, classicism), and the dynamical systems paradigm (with connectionism 
as its most prominent special case). We will briefly outline the ways 
in which the two paradigms are usually distinguished, but we cannot present 
a definitive way of doing so. We will see in the subsequent chapters that there 
is a class of symbolic computation agents, and a class of network-like agents, 
s.t. for every agent in either of the two classes there is an agent in the other 
that is inferentially “equivalent”, i.e., where both agents are disposed to draw 
the same set of inferences. There are, however, practical differences: e.g., the 
symbolic computation agents will be seen to have a much more complex cognitive 
architecture and not to be equally effective. We hypothesize that this 
is due to the “mismatch” of a high-level architecture on the one hand, and a 
low-level task on the other. But if we searched for the implementation of 
inferences by means of rational argumentation procedures, it would presumably 
be the connectionist machines that failed for practical reasons, now due to the 
“mismatch” of a low-level architecture and a high-level task. 

13.1.1 The Symbolic Computation Paradigm 

According to the symbolic computation paradigm, (i) intelligent cognition de- 
mands structurally complex mental representations, s.t. (ii) cognitive process- 
ing is only sensitive to the form of these representations, (iii) cognitive process- 
ing conforms to rules, statable over the representations themselves and articu- 
lable in the format of a computer program, (iv) many mental representations 
have syntactic structure with a compositional semantics, and (v) cognitive 
transitions conform to a computable cognitive-transition function (we adopt 
this characterization essentially from Horgan&Tienson[82], with only slight 
deviations). The symbolic computation paradigm has been the “driving spirit” 
behind artificial intelligence, cognitive psychology, and, more generally, 
cognitive science, since the 1950s. “Physical symbol systems” (see Newell[114]), 
or a “language of thought” (Fodor[47]), have been argued to be a necessary 
prerequisite for intelligent cognition, since intelligent cognition is supposed to 
be “systematic” and “productive” (see Fodor&Pylyshyn[49]), i.e., the 
representational capacities of intelligent agents are supposed to be necessarily closed 
under various representation-transforming and representation-generating 
operations (e.g., if an agent is able to represent that aRb, she is also able to represent 
that bRa, etc.). This capacity is hypothesized to be due to the combinatorial 
properties of languages of mental symbols based on a recursive grammar. A 
cognitive agent that conforms to the symbolic computation paradigm has the 
belief that [$\varphi$ is true] if and only if a sentence $\varphi'$ of the agent’s internal 
language expresses the same proposition as $\varphi$ and is stored in the agent’s symbolic 
“knowledge base”. As far as cognitive processes are concerned, the paragon 
for the symbolic computation paradigm has been the mathematician or the 
logician who presents a proof on a blackboard: rules of inference govern his 
writing down symbolic expressions, or at least they seem to. The computations 
of a Turing machine are regarded as the abstract counterparts of such 
activities, and therefore recursion or computability theory can be regarded as 
the mathematical background theory for this approach. Our usual computers 
may be considered as finite-state machines that are finitary approximations 
to the “ideal limit” of Turing machines with infinite memory capacities. The 
rules that govern cognitive processes according to the symbolic computation 
paradigm are either represented within the cognitive agent as symbolic entities 
themselves, or they are “hard-wired”^. Inference processes are hypothesized to 
be internalizations of derivation steps within some logical system, and the 
alleged “systematicity” of inferences (see Fodor&Pylyshyn[49]) is explained by 
the internal representation or hard-wiring of rules that are only sensitive to 
the form of sentential representations. Johnson-Laird[83], Rips[134], Braine[23], 
and O’Brien[117] have developed this “mental rules” model of human inference into 
a psychological theory of human reasoning. 
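
The picture of a symbolic knowledge base with stored rules and one hard-wired rule that reads and executes them can be sketched as follows. This Python fragment is a minimal illustration under our own assumptions: sentences of the internal language are plain strings, stored conditional rules are tuples of the hypothetical form `("if", antecedent, consequent)`, and the only hard-wired rule is modus ponens, implemented by the function `close_under_modus_ponens`. None of these names stem from the literature cited above.

```python
def close_under_modus_ponens(knowledge_base):
    """Hard-wired interpreter: apply stored conditionals until nothing new follows.

    The conditionals are symbolic entries of the knowledge base itself; the
    loop below is the one rule that is not represented symbolically.
    """
    kb = set(knowledge_base)
    changed = True
    while changed:
        changed = False
        for sentence in list(kb):
            if isinstance(sentence, tuple) and sentence[0] == "if":
                _, antecedent, consequent = sentence
                if antecedent in kb and consequent not in kb:
                    kb.add(consequent)
                    changed = True
    return kb


def believes(kb, sentence):
    """On this picture, a belief is simply a sentence stored in the knowledge base."""
    return sentence in kb


initial = {
    "Bird(a)",
    ("if", "Bird(a)", "CanFly(a)"),
    ("if", "CanFly(a)", "HasWings(a)"),
}
closed = close_under_modus_ponens(initial)
```

Adding a further sentence to `initial` never removes a derived one, so a system of this kind is monotonic unless its rules are changed; the processing is sensitive only to the form of the stored entries.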

13.1.2 The Dynamical Systems Paradigm 

The dynamical systems paradigm of cognitive science may be summarized by 
what van Gelder[173] calls the “dynamical hypothesis”: “for every kind of cog- 
nitive performance exhibited by a natural cognitive agent, there is some quan- 
titative [dynamical] system instantiated by the agent at the highest relevant 
level of causal organization [i.e., at the level of representations], so that per- 
formances of that kind are behaviors of that system” (van Gelder[173], p.622; 
see also van Gelder[175]). The dynamical systems tradition of cognitive science 

^It is clear that not all rules can be represented in a symbolic memory, because at least 
one rule is needed by which the represented rules are read and executed. 

in the modern era reaches back to the early days of cybernetics and systems 
theory (see e.g. Ashby[11]). A dynamical system may be regarded as a pair 
of a state space and a set of trajectories, s.t. each point of the space corre- 
sponds to a total cognitive state of the system, and every point of the space 
lies precisely on one trajectory. If a certain point corresponds to the system’s 
total cognitive state at t, the further evolution of the system follows the tra- 
jectory emanating at this point. Usually, such systems are either defined by 
differential equations, or by difference equations, defined over the points of the 
state space: in the first case one speaks of continuous dynamical systems with 
continuous time, while in the latter case one speaks of discrete dynamical sys- 
tems with discrete time. In the discrete case, the set of trajectories may be 
replaced by a state-transition mapping, s.t. each trajectory is generated by the 
iterated application of the mapping. The mathematical background theory for 
the dynamical system paradigm is of course the theory of dynamical systems^ 
that includes the traditional study of differential equations and systems theory. 
But dynamical systems may also be studied from an extended computational 
perspective (see Blum et al.[20]). A cognitive dynamical system is a dynamical 
system with representations, where points, states - in particular, stable 
states - attractors, etc., can be ascribed a content, i.e., they are interpreted. 
Our solar system is, e.g., a dynamical system, but not a cognitive dynamical 
system. Human or animal brains are cognitive dynamical systems. Cognitive 
processes, like inferences, take place in cognitive dynamical systems by certain 
(total) state-transitions of the system, but not every state-transition is nec- 
essarily also cognitive. The class of cognitive dynamical systems in the sense 
elaborated now, includes all Turing machines and also all finite-state machines 
with representational states, since such automata are simply special kinds of 
discrete dynamical systems. Therefore, up to now, we have only taken a more 
general viewpoint on cognition than the symbolic computation point of view 
has offered to us. However, the dynamical hypothesis quoted above is supposed 
to be more specific than that: it assumes that intelligent cognition is subserved 
by state-transitions in quantitative systems, i.e., systems in which a metrical 
structure is associated with the points of the state space and of course such 
structure is also associated with time, s.t. the dynamics of the system is system- 
atically related to the distances measured by the metric function. The distances 
between points may be regarded as a measure of their similarity qua being total 
cognitive states. Moreover, the dynamical systems that are typically focused on 
in the dynamical systems paradigm also have a vector space structure, and thus 
they “support a geometric perspective on system behaviour” (van Gelder[173], 
p.619). But although it is obvious that not every state transition mapping of 
every dynamical system is computable by a Turing machine, it is not clear at 
all that Turing machines or finite state machines are excluded by reference to 
“quantitative” properties of a system, since the programs of Turing machines 
and the state-transition mappings of finite-state machines might be translated 
into difference equations. In particular, if the program or the state-transition 
mapping has not been represented symbolically itself, the distinction between a 
rule-following symbolic computation machine based on a hard-wired program, 
and a rule-(i.e., equation-)governed dynamical system starts to collapse. Finite-state 
machines may not only be regarded as finite approximations to Turing 
machines, but also to dynamical systems in general. Thus, if there is an essential 
difference between the cognitive architectures suggested by the symbolic 
computation paradigm on the one hand and by the dynamical systems paradigm 
on the other, it seems to lie on the representational level - on the level of how 
such systems represent. 

^ Chaos theory is a branch of the modern theory of dynamical systems. 
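
The remark that, in the discrete case, trajectories are generated by iterating a state-transition mapping, and that finite-state machines fall under the same scheme, can be illustrated by a short Python sketch. Both example systems are hypothetical and chosen by us for simplicity: a two-state machine given by a transition table, and the difference equation x(t+1) = 0.5 * x(t) + 1, whose state space carries the usual distance on the real numbers.

```python
def trajectory(step, state, n):
    """Generate a trajectory by applying the state-transition mapping n times."""
    states = [state]
    for _ in range(n):
        state = step(state)
        states.append(state)
    return states


# A finite-state machine viewed as a discrete dynamical system (no metric needed).
TRANSITIONS = {"even": "odd", "odd": "even"}


def fsm_step(state):
    return TRANSITIONS[state]


# A quantitative system: a difference equation on the real line with fixed point 2.
def linear_step(x):
    return 0.5 * x + 1.0


fsm_run = trajectory(fsm_step, "even", 3)
linear_run = trajectory(linear_step, 0.0, 3)

# In the quantitative system the dynamics is systematically related to distance:
# the distance to the fixed point is halved at every step.
distances = [abs(x - 2.0) for x in linear_run]
```

The same function `trajectory` serves both systems; what distinguishes the second is only the metrical structure on its states, which is the point at which the dynamical hypothesis is meant to be more specific.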

13.1.3 Connectionism as a Special Case of the Dynamical Systems Paradigm 

Connectionism is the most well-known brand of the dynamical systems 
paradigm: the dynamical systems that are in the limelight of connectionists are the 
so-called artificial neural networks, i.e., abstractions from the complex brain 
“wetware” in terms of systems of primitive units and connections, where 
activation is propagated through the network in accordance with simple and local 
quantitative propagation functions. Although the basic ideas of connectionism 
have been available since the 1940s (see McCulloch[107]), it was only 
in the 1980s that connectionism became broadly influential (see Rumelhart et 
al.[138] for the “locus classicus” textbook). Smolensky [162] characterizes (one, 
particularly influential, version of) connectionism by the following hypotheses: 
(i) “The connectionist dynamical system hypothesis: The state of the intuitive 
processor [i.e., the part of an intelligent cognitive agent that is responsible for 
the kind of cognitive activity that does not consist in conscious rule application] 
at any moment is precisely defined by a vector of numerical values (one for each 
unit). The dynamics of the intuitive processor are governed by a differential 
equation. The numerical parameters in this equation constitute the processor’s 
program or knowledge. In learning systems, these parameters change according 
to another differential equation.” (ii) “The subconceptual unit hypothesis: The 
entities in the intuitive processor with the semantics of conscious concepts of 
the task domain are complex patterns of activity over many units. Each unit 
participates in many such patterns.” (iii) “The subconceptual level hypothesis: 
Complete, formal, and precise descriptions of the intuitive processor are gener- 
ally tractable not at the conceptual level, but only at the subconceptual level.” 
The subconceptual level is the level of analysis that is preferred by the connec- 
tionist paradigm, or, as Smolensky expresses it, the subsymbolic paradigm; it 
lies “below” the conceptual level which is preferred by the symbolic computa- 
tion paradigm, but “above” the neural level preferred by neuroscience. In the 
classical McCulloch&Pitts[106] model, every single node in a neural network 
represents a single proposition: this “one unit/one item” form of representa- 
tion is nowadays called ‘local’. In associative networks, and in the so-called 
semantical networks (see e.g. Bibel[19], p.33 and p.93), both the nodes and 
the connections between nodes are used as localized representations, s.t. nodes 
represent words or meanings, while the connections represent associations; in 
inheritance networks (see e.g. Gabbay et al.[55], chapter 3) the nodes and the 
connections are again used as localized representations, s.t. nodes represent 
properties or objects, while the connections represent containment or mem- 
bership relations. But in today’s connectionist approaches a distributed form of 
representation in networks is preferred by which patterns of activity distributed 
over an ensemble of units, and patterns of (weights of) connections are the rep- 
resenting entities. In this way the states of single nodes or of single connections 
are not symbols themselves but rather subsymbols (this is Smolensky’s [162] 
term), i.e., constituents of the symbols that are used in the symbolic computa- 
tion paradigm. Symbols are now identified with sets of such subsymbols, and 
the symbols are not manipulated by symbolic operations directly, but their 
subsymbolic constituents rather participate in much more fine-grained numeri- 
cal computations. This is not meant to entail that every representation within 
a connectionist dynamical system has to be distributed according to the sub- 
conceptual unit hypothesis, but just that most of the representations have to 
be distributed if intelligent cognition should be carried out by networks. This 
is argued for by connectionists in the following way: (i) for complexity con- 
siderations, it is practically impossible to represent a great number of entities 
locally, and (ii) the learning mechanisms, which are so successfully applied in 
artificial neural networks, need distributed representations in order to function 
appropriately, since only under distributed representation it is the case that the 
revision of one representation causes the revision ( “learning” ) of another repre- 
sentation as an automatic side effect, without any further necessary association 
devices (see Rumelhart et al., vol.1, chapter 3). Still, e.g., a single unit may 
represent a so-called “microfeature”, as is very often the case in connectionist 
models; but in such a case the superpositions of such microfeature represen- 
tations are usually used as distributed representations. There is no general 
agreement on whether distributed representations differ substantially from the 
symbolic language-like representations of the symbolic computation paradigm, 
or whether they are simply symbolic representations themselves which arise 
from subsymbolic numerical computations. 

Now let us turn to cognitive processes in connectionist networks: in- 
ferences are, according to the connectionist paradigm, again (total) state tran- 
sitions in a network whereby an input pattern of activity that represents a 
certain belief content causes a further pattern of activity that also represents 
a belief content; the output pattern of such a process is often identified with 
a (in case of uniqueness: “the”) pattern of activity that is stable under the 
activity propagation initiated by the input. Inferences in networks differ from 
their symbolic computation counterparts in so far as no rule of inference is 
applied explicitly, and the rules that are perhaps represented by the total 
network topology and by the weights of the edges are not applied successively 
but rather collectively, s.t. a single connection is just a “soft” constraint on the 
final conclusion. As Smolensky [162], p.18, writes: “Formalizing knowledge in 
soft constraints rather than hard rules has important consequences. Hard con- 
straints have consequences singly; they are rules that can be applied separately 
and sequentially - the operation of each proceeding independently of whatever 
other rules may exist. But soft constraints have no implications singly; any one 
can be overridden by the others. It is only the entire set of soft constraints that 
has any implications. Inference must be a cooperative process [. . .] Further- 
more, adding additional soft constraints can repeal conclusions that were for- 
merly valid: Subsymbolic inference is fundamentally nonmonotonic.” Inferences 
within neural networks do not differ substantially from pattern recognition or 
pattern completion processes, and they are therefore also realizable by networks 
as quickly as the latter tasks. Indeed, as we will see below, a nonmonotonic in- 
ference might be regarded as a process of belief pattern completion. By means 
of learning mechanisms, the parallel distributed processing in artificial neural 
networks “redirects attention to environmental regularities that are statistical” 
(Smolensky [163], p.184), s.t., “Certain sub-symbolic systems can be identified 
as using statistical inference” (Smolensky [162], p.72). 
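
How an inference can be realized as the settling of a network into a stable pattern, and how an additional input can repeal a former conclusion, may be illustrated by a small qualitative network. The Python sketch below is hypothetical in all its details: the units `Bird`, `Penguin`, and `CanFly`, the two excitatory connections, and the convention that an active unit may block an excitatory connection are our own simplifying assumptions, chosen only to show inhibition at work; they are not the network definitions developed in the subsequent chapters.

```python
# Excitatory connections: an active source unit activates the target unit.
EXCITATORY = {("Bird", "CanFly"), ("Penguin", "Bird")}

# Inhibition: an active unit blocks the listed excitatory connections.
INHIBITORY = {"Penguin": {("Bird", "CanFly")}}


def step(active, inputs):
    """One synchronous update: inputs stay clamped, unblocked excitation spreads."""
    blocked = set()
    for unit in active:
        blocked |= INHIBITORY.get(unit, set())
    nxt = set(inputs)
    for source, target in EXCITATORY:
        if source in active and (source, target) not in blocked:
            nxt.add(target)
    return frozenset(nxt)


def stable_state(inputs, max_steps=20):
    """Iterate the update until the pattern of activity no longer changes."""
    active = frozenset(inputs)
    for _ in range(max_steps):
        nxt = step(active, inputs)
        if nxt == active:
            return active
        active = nxt
    raise RuntimeError("no stable state reached")


from_bird = stable_state({"Bird"})
from_bird_and_penguin = stable_state({"Bird", "Penguin"})
from_penguin = stable_state({"Penguin"})
```

With the input pattern `Bird` the network completes the pattern by `CanFly`; with the richer input `Bird` and `Penguin` the same connection is blocked and `CanFly` is no longer part of the stable state. No rule is applied explicitly; the conclusion is read off the pattern in which the network comes to rest.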

13.1.4 Comparison 

There is still an intensive discussion on whether the symbolic computation 
paradigm and the dynamical systems paradigm mutually exclude each other. 
Smolensky [162] argues that both paradigms have their preferred level of 
analysis. But Smolensky [162], pp.6f, also presents the following hypothesis as a 
comprehensive hypothesis underlying connectionism: “The subsymbolic 
hypothesis: The intuitive processor is a subconceptual connectionist dynamical system 
that does not admit a complete, formal, and precise conceptual-level 
description.” If this hypothesis is true, the paradigms are indeed mutually exclusive. 
Horgan&Tienson[81] point in a similar direction, when they argue that 
cognitive dynamical systems may only be described by ceteris paribus laws, which 
would thus be the appropriate laws for psychology. They regard such laws as 
the “natural expression of a defeasible causal tendency” (p.109) in a 
cognitive dynamical system. Cognitive states are viewed as emitting forces which 
tend to activate other cognitive states, but only if there is no further active 
state with a stronger contrary impulse which defeats the first one. According 
to Horgan&Tienson, this is best described by ceteris paribus laws as opposed 
to the exceptionless “hard” laws that may be used as the rules of a symbolic 
computation machine. But this identification of “hard” laws with the rules 
applied within symbolic computation architectures is rash: the results of the 
subsequent chapters show that logically closed sets of defeasible laws may be 
used to describe the general beliefs and thus the inference dispositions of a 
network agent both correctly and completely; but, of course, the logic has to 
be changed accordingly. 

Perhaps, as Gärdenfors[57], pp.67f, emphasizes, the two paradigms 
should simply be seen as complementing each other: “they are best viewed as 
two different perspectives that can be adopted when describing the activities 
of various computational devices.” New results concerning symbol manipulation 
in networks (see e.g. Chen&Honavar[33], Chalmers[31]) and rule extraction 
from networks (see e.g. d’Avila Garcez et al.[13], or Hölldobler[80]) show that 
there might be continuous paths of transition from the one paradigm to the 
other. Moreover, hybrid systems consisting of symbolic and network 
components have also been suggested (the so-called “integrated connectionist/symbolic 
cognitive architectures”: see Legendre et al.[87]). Horgan&Tienson[82] argue 
that intelligent agents have to be cognitive dynamical systems, but with 
symbolic representations. 

But it is obvious that the prototypical symbolic computation imple- 
mentations of cognitive agent functions on the one hand, and the prototypical 
dynamical systems implementations on the other, differ concerning their mer- 
its and shortcomings: (i) a symbolic computation agent which uses a symbolic 
knowledge base is able to integrate a new belief by simply adding another sym- 
bolic representation to its knowledge base; in order to reach the same result 
a net agent may perhaps be forced to re-adjust the topology/weights for the 
whole network. But this mutual independence of symbolic representations is 
at the same time the very reason why learning is so much more successfully 
implemented by network agents with distributed representation; (ii) symbolic 
computation allows for extremely general architectures that work, e.g., in ev- 
ery possible environment: a derivation (production/expert) system is poten- 
tially capable of deriving formulas from an arbitrary symbolic knowledge base 
by means of generally valid rules of inference. This high degree of general- 
ity is usually paid for by high computational complexity and a resulting lack 
of speed and of practical applicability. On the other hand, if a net agent is 
equipped with or has settled (by learning) to a set of weights, she is quick and 
efficient with respect to her cognitive performances; (iii) a further advantage 
that a symbolic agent has concerning inferential activities is that she is able 
to unfold the assumptions on which her inferences have been based, and how 
they have been derived; a net agent can only tell us about the trajectory of 
states that have led to her conclusions, which is a completely different issue; 
as we have indicated in part II, this is also the reason why inferentially ideal 
agents in an internalist sense of the word are perhaps even necessarily symbolic 
computation agents; by contrast, (iv) symbolic computation agents tend to be 
not as robust as dynamical system agents, since a small alteration of one of 
the symbol representations may lead to drastic changes in the output if not to 
total malfunctioning; but neural networks usually show the property of smooth
degradation. On the other hand, the more chaotic a dynamical system is, the
more sensitive it is also with respect to slight internal variations: the
nonmonotonicity of high probability or of normic inferences might even be regarded as
entailing chaotic properties of dynamical system agents that are disposed to
draw such inferences, since slight variations in the inference premises may lead
to drastic changes in the inferential output; (v) symbolic computation agents
have notorious difficulties with uncertain, vague, and incomplete knowledge, 
due to the all-or-nothing character of symbolic representations (often referred 
to as “brittleness”), whereas for artificial neural networks the latter properties 
describe the normal case of knowledge representation; (vi) there is still no neu- 
rophysiological evidence that would support the claim that natural agents like 
higher animals or human beings are symbolic computation agents, whereas ar- 
tificial neural networks are intended to be abstract accounts of neural circuits, 
though only crude and simplified ones; but, of course, also the neural plausibil- 
ity of artificial neural networks is far from being settled; (vii) conversely, the 
symbolic computation paradigm is so much closer to our introspective aware- 
ness of cognitive states and processes than the connectionist paradigm; (viii) 
up to now, the problem domains in which the symbolic computation architec- 
tures have been applied successfully, and the domains in which the dynamical 
system architectures have been applied successfully, have turned out to be mu- 
tually exclusive. If we put this in a nutshell: the more high-level, abstract, and 
- in the case of human agents - introspectively accessible a cognitive process 
is, the more easily it is implemented by means of symbolic computation. The 
more low-level, commonsense, and unconsciously proceeding, the more resistant
a cognitive process is with respect to symbolic computation: in such a
case, a cognitive dynamical system implementation should be looked for as an
alternative.§ Therefore, whether inference is implemented more adequately by
symbolic computation agents or by cognitive dynamical systems, depends on 
whether inference is determined to be a high-level or a low-level process (while 
being aware of the vagueness of such qualifications). 

According to the traditional view on (human) inference, inference is 
a complex high-level process that is comparable to a logician’s or mathemati- 
cian’s derivation of a theorem. Often this view is even taken to be entailed 
by our folk-psychological theory of belief and inference, and this alleged inter- 
dependence of symbolic computation and folk-psychological analysis is either 
interpreted as weighing heavily against folk-psychology in the eyes of some neu- 
rophilosophers (like the Churchlands; see e.g. [35]), or against connectionists in 
the eyes of the symbolic computationalists (see e.g. Ramsey et al. [129]). But 
our folk-psychologically inspired explication of the notion of inference in part
I has not revealed a commitment to any kind of cognitive architecture. It
might be argued that this is due to examples like the cat&bird example which
we have taken to be prototypical instances of inferential activity, although the
cognitive agents concerned are not human. But this would be a misunderstanding:
inferences of the same type as in the cat&bird example are of course also
drawn by human agents - such inferences simply belong to those cognitive
processes that human agents share with many non-human agents. The theory of
inference that we have developed in part I is claimed to be true of all kinds of
agents that satisfy the assumptions for A which we have tried to make explicit.
These agents include human beings, certain animals, and maybe also certain
artificial agents. It is indeed the case that we have not assumed inferences to
be similar to scientific justifications, but this would have constituted a much
too restricted approach to inference anyway. According to our explication of
the concept of inference in part I, inferences can in principle occur on every
level of cognitive complexity, and we hypothesize (this is an empirical
hypothesis again) that they actually do so, and that virtually any kind of cognitive
activity like perceiving, memorizing, deciding, acting, etc. is accompanied by
inferences, in particular, by nonmonotonic inferences. In most of these cases,
inferences will be low-level processes subserving the essential cognitive needs
of an agent. Therefore, at the current state of our knowledge, successful
low-level dynamical system implementations of such inferences seem to be more
promising than successful high-level symbolic computation implementations as
far as practical considerations are concerned.

§This is also pointed out by Eliasmith[44]; but he suggests so-called “holographic reduced
representations” as a form of representation in neural networks that is open for both low-
and high-level cognition.



13.2 The Cognitive Architecture of Ideal Agents 

13.2.1 General Remarks 

If the latter hypothesis is true, this also holds for justified low-level inferences, 
as they have been characterized in part II by means of a low-level notion of 
justifiedness. This notion has been particularly designed so as not to exclude 
ideal low-level agents by a priori considerations. The question which now arises
is this: are there any (perfectly or approximately) ideal low-level agents in 
the sense of part II that are additionally powerful, i.e., who are capable of 
drawing a “great number” of relevant justified inferences? If the answer is 
yes, then the results reported in the last section indicate that the cognitive 
architecture of such an agent is more probably a low-level dynamical system 
architecture, in particular, a neural network architecture, than a high-level 
symbolic one. The question of whether there is an ideal low-level agent at all is 
an important one, because we might be confronted with the following dilemma: 
(i) symbolic agents might indeed be possible architectures for ideal agents in 
principle, but their complex high-level style of computation does not allow for 
ideal, symbolic computation real-world agents that are also subject to practical
constraints like time and memory feasibility; (ii) neural networks, on the other 
hand, might be practically superior to symbolic computation agents due to a 
low-level architecture that is composed of simple nodes and connections, but 
they might at the same time fail to be ideal agents for reasons of principle, since
the generation of justified inferences might demand symbol manipulation. If
these two claims were true, this would indicate that ideal agents cannot exist
in the real world, but just by the grace of philosophical ink. Let us now deal
with these two claims while restricting ourselves to primary justification, and 
to inferences from perceptual beliefs only. 

13.2.2 Ideal Symbolic Computation Agents and Why They Fail for Practical 
Reasons 

Let us first point out that there are some good reasons for accepting the first
claim: let 𝔐_act be the actual model (a tuple of component models) that is
associated with our epistemological sandbox territory; each of the component
models is based on the same set W_act of worlds (defined by the set D_act of
objects). An ideal symbolic computation agent A might now be constructed
in the following way: take three knowledge bases, i.e., memory storages for
symbolic expressions: two factual knowledge bases for sentences in ℒ, and one
conditional knowledge base for sentences in ℒ_→ or ℒ_⇒, where ⇒ is either a
high probability or a normic conditional. Subsets of the set of conditionals
that are true in 𝔐_act - depending on what kind of conditional has been
selected - are stored in the conditional knowledge base, and A has the central
occurrent general belief that [α[x] →/⇒ β[x] is true] if and only if
α[x] →/⇒ β[x] is stored in the (central) conditional knowledge base. Analogously,
A has the occurrent perceptual belief that [α[a] is true] if and only if α[a] is
stored in the first factual knowledge base, the perceptual factual knowledge base.
If inferential
activity is to be started, A joins the entries of the perceptual knowledge base 
by conjunction: let φ[a] be the factual formula that arises in this way. Now A
applies the rules of the system M (of part III) to the universal conditionals in 
the conditional base, and the rules of one of the nonmonotonic systems to the 
defeasible conditionals in the conditional base (and in the case of the rules of 
Left Logical Equivalence and of Right Weakening to both). In the probabilistic 
case, the rules of the system P are used; in the normic case, the rules of one of 
the systems C, CL, P are used, where P again seems to be the best option. Every 
conditional that is derived in this way, is stored in the conditional knowledge 
base again. Let us suppose that A iterates this procedure (but note that there 
are actually more effective procedures available in the relevant literature), that
A applies the logical rules in all combinatorially possible ways, and that the
conditional knowledge base has infinite memory capacity. By the theorems of
part III, every derived conditional is true in 𝔐_act.

If a conditional φ[x] →/⇒ ψ[x] is derived in this way, ψ[a] is stored by A in
the second factual knowledge base, the central factual knowledge base, s.t. A
has the central occurrent belief that [ψ[a] is true] if and only if ψ[a] is stored
there. Thus, if φ[x] →/⇒ ψ[x] is stored in the conditional knowledge base, A
also has the dispositional general belief that [φ[x] →/⇒ ψ[x] is true], since A
is disposed to draw a direct inference leading to the belief that [ψ[a] is true]
under the circumstances that (all that) she perceptually believes is that [φ[a]
is true]. By applying our definitions in part II, A indeed draws deductive and
high probability/normic inferences in this way, and the derivation process of
conditionals from further conditionals is a monotonic/nonmonotonic reasoning 
process. By theorem 55 in part II, A even draws primarily justified deductive 
and high probability/normic inferences. If the set of conditionals by which 
the inference process is initiated is acquired by a justified learning process, A 
even draws these inferences justifiedly simpliciter, since the inferences are first- 
order reliable, and the monotonic/nonmonotonic reasoning processes that the 
latter inferences are based on are second-order reliable (recall our definitions 
of reliability in part II). If the initial set of conditionals is sufficiently large, A 
might even draw every possible justified inference in such a way. 

An example: let the perceptual knowledge base contain just the formula
Bird(a), since all that A has perceived is that there is a bird. The conditional
knowledge base contains the following conditionals: Bird(x) ⇒_hp CanFly(x),
Bird(x) ⇒_hp Wings(x), Penguin(x) ⇒_hp ¬CanFly(x), Penguin(x) → Bird(x),
representing the contents of general beliefs. Now the following might
happen:

1. A reasons by the And-Rule:

Bird(x) ⇒_hp CanFly(x), Bird(x) ⇒_hp Wings(x)
---------------------------------------------
Bird(x) ⇒_hp (CanFly(x) ∧ Wings(x))

2. A adds CanFly(a) ∧ Wings(a) to the central factual knowledge base.

In this case, A has nonmonotonically inferred CanFly(a) ∧ Wings(a)
from Bird(a).

Now consider a different situation in which the perceptual knowledge
base only contains the formulas Bird(a) and Penguin(a), since all that A has
perceived is that there is a bird that is a penguin. The algorithm starts again,
and at some computation step the following sequence might be generated:

1. A reasons by the Reflexivity-Rule:

Penguin(x) ⇒_hp Penguin(x)

2. A reasons by the Right-Weakening-Rule:

Penguin(x) ⇒_hp Penguin(x), Penguin(x) → Bird(x)
------------------------------------------------
Penguin(x) ⇒_hp Bird(x)

3. A reasons by the Cautious-Monotonicity-Rule:

Penguin(x) ⇒_hp Bird(x), Penguin(x) ⇒_hp ¬CanFly(x)
---------------------------------------------------
Penguin(x) ∧ Bird(x) ⇒_hp ¬CanFly(x)

4. A adds ¬CanFly(a) to the central factual knowledge base.

Thus, A has nonmonotonically inferred ¬CanFly(a) from Bird(a) ∧
Penguin(a) (for the rules of inference that are used here see chapter 10 of
part III).
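
The two derivations can be replayed by a small program. The following Python sketch is an illustration only, under simplifying assumptions of our own: formulas are restricted to conjunctions of literals (represented as sets), "~CanFly" stands for the negated literal, Reflexivity is applied only to antecedents that actually occur, and the names DEFEASIBLE, STRICT, saturate and infer are hypothetical. It exhaustively applies Reflexivity, Right Weakening, And, and Cautious Monotonicity until nothing new is derived.

```python
from itertools import product

# Conditional knowledge base of the example; "~CanFly" stands for the negation.
DEFEASIBLE = [({"Bird"}, {"CanFly"}), ({"Bird"}, {"Wings"}), ({"Penguin"}, {"~CanFly"})]
STRICT = [({"Penguin"}, "Bird")]        # universal conditional Penguin(x) -> Bird(x)


def saturate(percepts):
    """Close the conditionals under Reflexivity, Right Weakening, And and
    Cautious Monotonicity; sets of literals stand for conjunctions."""
    derived = {(frozenset(a), frozenset(c)) for a, c in DEFEASIBLE}
    # Reflexivity, restricted to the antecedents that are actually needed.
    starts = {a for a, _ in derived} | {frozenset(percepts)}
    derived |= {(a, a) for a in starts}
    while True:
        new = set()
        for ante, cons in derived:
            for body, head in STRICT:   # Right Weakening with a strict conditional
                if body <= cons:
                    new.add((ante, cons | {head}))
        for (a1, c1), (a2, c2) in product(derived, repeat=2):
            if a1 == a2:
                new.add((a1, c1 | c2))  # And
                new.add((a1 | c1, c2))  # Cautious Monotonicity
        if new <= derived:
            return derived
        derived |= new


def infer(percepts):
    """Literals the agent would add to the central factual knowledge base."""
    p = frozenset(percepts)
    concluded = set()
    for ante, cons in saturate(p):
        if ante == p:
            concluded |= cons
    return concluded - p


from_bird = infer({"Bird"})
from_penguin = infer({"Bird", "Penguin"})
```

Under these assumptions, from_bird contains CanFly and Wings, whereas from_penguin contains only ~CanFly: adding the percept Penguin(a) retracts the conclusion CanFly(a). Even for this tiny knowledge base the procedure pairs every derived conditional with every other one in each round, which gives a first impression of why exhaustive enumeration scales badly.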

The recursive enumeration algorithm that we have sketched above is 
of course unrealistic concerning time and memory capacity, because it might
take both too long and too much memory until a certain inference is drawn
by the symbolic computation agent. If the conclusion of such an inference is 
important for the agent’s needs, the agent may fail to attain her goals; in the 
worst case, she may be destroyed. 

A different kind of procedure by which inference in symbolic computation
agents may be implemented is by a decision routine: let φ[a] be again
the conjunction of the entries of A’s perceptual knowledge base. A computes
a (relevant) query φ[x] →/⇒ ψ[x], s.t. the task is now to decide whether
φ[x] →/⇒ ψ[x] is entailed (in the sense of one of the semantical systems of
part III) by the conditional knowledge base; if that is the case, ψ[a] is added to
the central factual knowledge base and A has thus inferred ψ[a] from φ[a]. It is
known that such a semantically oriented decision problem for, say, preferential
entailment, is co-NP-complete and thus as hard as the unsatisfiability decision
problem for propositional formulas (see Lehmann&Magidor[88], p.16)¶. This
inference-by-decision algorithm is again an extremely general symbolic compu- 
tation architecture and works for every possible perceptual knowledge base and 
for every possible conditional knowledge base, i.e., for every possible environ- 
ment under every possible probability measure or normality ordering. This high 
degree of generality is paid for by computational complexity, and therefore the 
architecture might lack practical applicability in a real-world domain like, e.g., 
the cognition of a robot agent in a hostile and quickly changing environment. 

These results do not, of course, prove that the practically feasible
implementation of ideal agents by symbolic computation is impossible, but they show
that there are good reasons to believe that the high-level approach and the
degree of generality that are usually characteristic of symbolic computation
architectures interfere with practical constraints.

Let us now deal with claim (ii) from above: we will show in the sub- 
sequent chapters that inferentially ideal agents may indeed be implemented 
by simple neural network architectures while still being practically adequate. 
Thus, there are ideal agents that are capable of coping with the real world, 
and, given that Smolensky’s sub-symbolic hypothesis turns out to be true, 
their architectures are (highly) simplified abstractions from the brain
architectures that we use. For the sake of simplicity, we will restrict ourselves in what
follows to nonmonotonic inferences only.

¶But, as Lehmann&Magidor prove, the decision problem is polynomial in the case of Horn
assertions. As Lehmann&Magidor[88], p.41, also show, the decision procedure for rational
closure is essentially as complex as the satisfiability problem for propositional formulas. For
a comprehensive overview on such results, see Eiter&Lukasiewicz[43].

Here is the plan of the subsequent sections and chapters‖: in the first
section of the next chapter we recall some relevant background information 
concerning networks and nonmonotonic inferences. In the second section of 
chapter 14 inhibition nets are defined, and a basic lemma on their topology 
is derived. The third section demonstrates that inhibition nets may be re- 
garded as certain dynamical systems with convenient properties. Chapter 15 is 
devoted to the interpretation of these dynamical systems as cognitive agents 
which have beliefs and which draw nonmonotonic inferences. In chapter 16 we 
present and prove (i) a soundness result stating that the inferences drawn by 
so-called cumulative-ordered interpreted inhibition net agents obey the rules of 
the system CL (introduced by KLM[85], pp. 186-189, and discussed in part III) 
of nonmonotonic logic for normic conditionals, and (ii) a completeness result 
stating that for every agent which is disposed to draw inferences obeying these 
rules there is a cumulative-ordered interpreted inhibition net agent which is 
disposed to draw precisely the same inferences. In the appendix we add cor- 
responding results for the systems C, P, CM, and M, relative to other classes 
of inhibition net agents. In chapter 17 some implications of our results are 
discussed, in particular, concerning the realizability of ideal agents by cogni- 
tive inhibition nets, and cognitive artificial neural networks. Chapter 18 deals 
with the relationship between inhibition nets and some symbolic nonmonotonic 
reasoning mechanisms, particularly logic programs which employ “negation as 
failure”. Chapter 19 relates inhibition nets to more usual kinds of artificial neu- 
ral networks, and shows that our results may be transferred also to networks of 
the latter kind. Chapter 20 summarizes the obtained results and defends these 
results against some possible objections. 



‖The following chapters on inhibition nets and their network semantics appeared already
in a condensed version in Leitgeb[90], but without any epistemological interpretation or 
justification. 




Chapter 14 

INHIBITION NETS AS SIMPLE NEURAL NETWORKS 



Several parts of the contents of this chapter and also of chapters 15-19 can be 
found in Leitgeb[90], though sometimes stated in different terminology. 



14.1 Background: Nonmonotonic Inferences and Networks

Throughout the following chapters we will partially employ the typical terminology
of dynamical systems theory and of connectionism, and we will introduce
abstract surrogates of some of the concepts used in these fields. At the same 
time we will see that the network agents that we consider reason according to 
a set of symbolically represented rules of a system of nonmonotonic logic. This 
logical treatment of network states and processes is, of course, not at all a new 
idea: the seminal paper by McCulloch&Pitts[106], which is in this respect a kind 
of paragon for our approach, explicitly tries to treat neural events and relations 
by means of propositional logic. Since its publication in the 1940s the
McCulloch&Pitts model has been criticized for various reasons: (i) the networks used
are neurobiologically implausible, (ii) as we have already pointed out, single 
nodes are interpreted as representing propositions and thus localized represen- 
tations are employed rather than distributed ones as connectionism has it, and 
(iii) classical propositional logic is used on the symbolic side where perhaps a 
system of nonmonotonic reasoning would be more adequate since such systems 
are supposed to be closer to the highly but not absolutely reliable commonsense 
reasoning that our brains are usually involved in. In the following sections we 
will introduce a class of networks for which claim (i) is still true - which we 
accept for the sake of abstraction and simplicity -, but the subsequent results 
will show that the objections expressed in (ii) and (iii) would fail if they were 
directed towards our approach. 

The combination of neural networks with nonmonotonic inference has 
also been suggested by a handful of other authors: Valiant[171], chapter 13.5,
discusses the nonmonotonic phenomena occurring in both commonsense rea- 
soning and neural network design. Schurz[143], p.59, explicitly compares the 
computations of biological and artificial neurons to the applications of default 
rules in nonmonotonic reasoning. Both papers state the general idea, but omit 
details and results. The most extensive treatment of this topic is to be found
in Balkenius&Gärdenfors[15] and Gärdenfors[57], where it is shown that state
transitions in artificial neural networks may be considered as nonmonotonic
inference processes. The ideas introduced in the latter two papers have been a
major source of inspiration for various of the following sections, in particular 
for our net semantics of defeasible conditionals. As indicated by [15] and [57], 
the inferential dispositions within many neural networks lack some of the closure
properties that we have found to be characteristic of ideal agents at the
end of part III (like cumulativity). Nets with so-called “shunting” interaction
of inputs (see [15], p.33) are hypothesized to satisfy cumulativity; however, this
is not proved by the authors, but only suggested by extensive computer simulations.
Put shortly, the results in [15] and [57] are restricted in certain ways,
and these restrictions seem to be due to the complexity of signal propagation in 
arbitrary neural networks. This is one reason why we restrict ourselves to the 
simpler case of inhibition nets (and, in chapter 19, to a certain subclass of the
class of artificial neural networks). See Leitgeb[92] for a more general account 
according to which all standard artificial neural networks can be considered as 
so-called “interpreted ordered dynamical systems” which are disposed to draw 
nonmonotonic inferences; the subsequent theory is but a special case of the 
latter generalized view. 

14.2 Inhibition Nets 

Inhibition nets are directed graphs with two kinds of edges: (i) edges between 
nodes, and (ii) edges between nodes and edges of type (i): 

Definition 115 (Inhibition Nets)

1. Let N be a non-empty set (the set of nodes),

2. let E ⊆ N × N (the set of excitatory connections),

3. let I ⊆ N × E (the set of inhibitory connections),

4. let bias ∈ N be fixed (the bias node).

Then X = ⟨N, E, I, bias⟩ is an inhibition net(work).

In the following we use ‘m’ and ‘n’ (with or without indices) as variables
ranging over nodes, ‘e’ for edges; ‘i’, ‘j’, ‘k’ will always range over
natural numbers. We will say that m E n (or m I e), when we actually mean
that (m, n) ∈ E (or (m, e) ∈ I).
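
Definition 115 translates directly into a small data structure. The following Python sketch is an illustration under assumptions of our own: the class name InhibitionNet, its field names, and the use of strings as node labels are hypothetical and do not occur in the text; the checks simply mirror the four clauses of the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InhibitionNet:
    """Illustrative container for X = <N, E, I, bias> (names are hypothetical)."""
    nodes: frozenset        # N: non-empty set of nodes
    excitatory: frozenset   # E: pairs (m, n) of nodes
    inhibitory: frozenset   # I: pairs (m, (n1, n2)) with (n1, n2) in E
    bias: str = "bias"

    def __post_init__(self):
        if not self.nodes:
            raise ValueError("N must be non-empty")
        if self.bias not in self.nodes:
            raise ValueError("bias must be a node")
        for m, n in self.excitatory:
            if m not in self.nodes or n not in self.nodes:
                raise ValueError("E must be a subset of N x N")
        for m, edge in self.inhibitory:
            if m not in self.nodes or edge not in self.excitatory:
                raise ValueError("I must be a subset of N x E")


# A net with one excitatory edge n1 -> n2 that is inhibited by n3.
net = InhibitionNet(
    nodes=frozenset({"bias", "n1", "n2", "n3"}),
    excitatory=frozenset({("n1", "n2")}),
    inhibitory=frozenset({("n3", ("n1", "n2"))}),
)
```

Note that an inhibitory connection is stored as a pair whose second component is itself an excitatory edge, which is what distinguishes inhibition nets from ordinary directed graphs.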

It may help to think of nodes as neurons, of the excitatory connections 
between nodes as excitatory connections between neurons, and of inhibitory 
connections as formal counterparts of presynaptic inhibitory connections (see 
e.g. Eccles[42], pp. 124-127). By means of the latter, neurons may inhibit excitatory
connections between other neurons without inhibiting the target neurons
of such connections themselves. But inhibition nets are of course far from being
plausible models of real neural assemblies. Moreover, inhibition nets are
also quite different from the usual artificial neural networks, since there are no
weights associated with the connections, there are no inhibitory connections
from nodes to other nodes, and, as will be seen later, there are no continuous
activation states for nodes, no weighted input summation within nodes, and no
complex activation functions. Finally, no use is made of learning procedures,
as is usually the case in neural network design.

The bias node bias will be the only node which is active in every state 
of the network. Thus we may assume that there is no n ∈ N s.t. n E bias, since
excitatory connections to bias would be without any use anyway. 

Furthermore we need the following concepts: 

Definition 116 (Paths in Inhibition Nets)

Let X = ⟨N, E, I, bias⟩ be an inhibition net:

1. a path is a sequence n_0, ..., n_k (k > 0) of nodes, s.t. for all i ∈ {0, ...,
k-1}:

n_i E n_{i+1}, or there is an n ∈ N s.t. n_i I (n, n_{i+1}).

We will say that such a path has length k. Generally, if m_1 E m_2, or
there is an n ∈ N s.t. m_1 I (n, m_2), we will say that m_1 is connected to
m_2 (in this order)

2. an E-path is a path n_0, ..., n_k of nodes, s.t. for all i ∈ {0, ..., k-1}:

n_i E n_{i+1}

3. a cycle is a path n_0, ..., n_k where n_0 = n_k.

Definition 117 (Hierarchical Inhibition Nets) 

Let X = ⟨N, E, I, bias⟩ be an inhibition net.

X is hierarchical iff it does not have cycles. 

Hierarchical inhibition nets will be shown to have the property that
none of their nodes has influence on its own activity, neither by excitation nor
by inhibition. However, such nets may e.g. contain nodes m_1, m_2, n_1, n_2, s.t.
m_1 I (m_2, n_2) and m_2 I (m_1, n_1): in such a case m_1 and m_2 will mutually
inhibit each other’s spreading activity. But such a form of mutual inhibition is
not circular or non-hierarchical in the sense specified above.
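
Whether a given finite net is hierarchical can be checked mechanically. The following Python sketch is only one possible way of doing so, assuming that E is given as a set of pairs (m, n) and I as a set of pairs (m, (n', n)); the function names successors and is_hierarchical are hypothetical. It builds the "is connected to" relation of Definition 116 and searches it depth-first for a cycle.

```python
def successors(E, I):
    """Map each node m to the nodes it is connected to (Definition 116)."""
    succ = {}
    for m, n in E:                      # m E n
        succ.setdefault(m, set()).add(n)
    for m, (_, n) in I:                 # m I (n', n)
        succ.setdefault(m, set()).add(n)
    return succ


def is_hierarchical(N, E, I):
    """Return True iff the net has no cycle, using a depth-first search."""
    succ = successors(E, I)
    state = dict.fromkeys(N, "new")     # new -> open -> done

    def visit(n):
        state[n] = "open"
        for s in succ.get(n, ()):
            if state[s] == "open":      # back edge: a path leads from s to s
                return False
            if state[s] == "new" and not visit(s):
                return False
        state[n] = "done"
        return True

    return all(visit(n) for n in N if state[n] == "new")


# Mutual inhibition of spreading activity, as in the text: no cycle arises.
N = {"bias", "m1", "m2", "n1", "n2"}
E = {("m1", "n1"), ("m2", "n2")}
I = {("m1", ("m2", "n2")), ("m2", ("m1", "n1"))}
mutual = is_hierarchical(N, E, I)

# Here n1 excites n2 while n2 inhibits an edge into n1, which closes a cycle.
E_cyc = {("bias", "n1"), ("n1", "n2")}
I_cyc = {("n2", ("bias", "n1"))}
cyclic = is_hierarchical({"bias", "n1", "n2"}, E_cyc, I_cyc)
```

The first net is hierarchical although m1 and m2 inhibit each other's outgoing edges; the second is not, because n1 is connected to n2 by excitation and n2 is connected back to n1 by inhibition.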

Contrary to arbitrary inhibition nets, hierarchical inhibition nets will 
be proved to have unique stable states of activity given a constant input state. 
This is the reason why we will mainly concentrate our efforts on them. More- 
over, we will only focus on finite hierarchical inhibition nets (FHINs) for the 
sake of simplicity, and because they are the practically relevant ones. 

Definition 117 obviously implies: 
Remark 118

If n_0, ..., n_k is a path in an FHIN, then n_i ≠ n_j for i ≠ j, because
otherwise there would be cycles. Thus in an FHIN there are only finitely many
paths between two nodes, since there are only finitely many nodes.

In figs. 1 and 2 you can see two FHINs (in the first case we have 
omitted the bias node graphically), which we will use as examples throughout 
the subsequent sections. They are defined in the following way: 

Example 119 X_1 = ⟨N_1, E_1, I_1, bias⟩, s.t. N_1 = {bias, n_1, n_2, n_3, n_4} is a set
of 5 nodes, n_1 E_1 n_2, n_1 E_1 n_3, n_4 I_1 (n_1, n_2), and there are no other
connections.



Fig. 1: X_1



Example 120 X_2 = ⟨N_2, E_2, I_2, bias⟩, s.t. N_2 = {bias, n_1, n_2, n_3}, bias E_2 n_1,
n_2 E_2 n_3, n_3 I_2 (bias, n_1), and there are no other connections.



Fig. 2: X_2




Now we will state and prove a lemma, which will give us some basic 
information about the topology of FHINs: 

Lemma 121 (Canonical Partitions of Inhibition Nets)

Let X = ⟨N, E, I, bias⟩ be an FHIN: then there are disjoint and non-empty
sets N_0, ..., N_k, s.t. N = N_0 ∪ ... ∪ N_k and

1. for all m, n ∈ N, s.t. there is a path from m to n, it holds that: m ∈ N_i,
n ∈ N_j for i < j,

2. for every n ∈ N \ N_0 there is an m ∈ N_0, s.t. there is an E-path from m
to n,

3. N_0 is the set of all nodes with (E-)indegree 0, i.e. the set of nodes without
excitatory connections leading to them,

4. bias ∈ N_0.

Proof:

• ⟨N, E⟩ is a finite directed acyclic graph (DAG), and thus it has a unique
non-empty point base N_0 ⊆ N where N_0 is the set of all vertices with
indegree 0 (see e.g. Harary[78], p.201), i.e. for every node n ∈ N \ N_0
there is a node m ∈ N_0 s.t. there is an E-path from m to n, and there is
no E-path between nodes in N_0.

• Now let N_i be the set of nodes n ∈ N s.t. the maximal length of a path
from a node in N_0 to n is i (for all i with 1 ≤ i). Since every node
n ∈ N \ N_0 is reachable from a node in N_0 by our selection of N_0, and
since there are only finitely many paths from nodes in N_0 to such an
n (by remark 118), there is indeed a path with maximal length to n, i.e.
there is an i s.t. n ∈ N_i. By maximality we have N_i ∩ N_j = ∅ for i ≠ j.

• Furthermore, if N_i is not empty for some i > 0, then also N_{i-1} is not
empty:

for there is an n ∈ N_i s.t. there is a maximal path m = n_0, ..., n_{i-1}, n_i =
n where m ∈ N_0; but then m = n_0, ..., n_{i-1} is also a maximal path from
a node in N_0 to n_{i-1} and thus N_{i-1} ≠ ∅, or else there would be a longer
path m_0, ..., m_k = n_{i-1} with m_0 ∈ N_0, k > i - 1; in the latter case n_i
would not be one of the nodes in m_0, ..., m_k, since otherwise there would
be cycles in X; it would follow that m_0, ..., m_k, n_i is a path with length
k + 1 > i (m_k = n_{i-1} is connected to n_i by the existence of the path
n_0, ..., n_{i-1}, n_i); but this contradicts that n ∈ N_i.

• This implies that for some k ≥ 0 the sets N_0, ..., N_k are disjoint,
non-empty, and N = N_0 ∪ ... ∪ N_k.

• Now suppose m, n ∈ N s.t. there is a path from m to n, where m ∈ N_i,
n ∈ N_j: the longest path m_0, ..., m_i = m from a node m_0 in N_0 to m
has length i; by assumption there is a path m = n_0, ..., n_k = n from m
to n; m_0, ..., m_i and n_0, ..., n_k have no nodes in common except m, or
else there would be cycles. Thus by concatenating the two paths we see
that the longest path from a node in N_0 to n has length j > i.

• The last three claims from above hold because of our selection of N_0, and
our assumption that there are no excitatory connections to bias. ■

Lemma 121 shows that FHINs are layered (if n ∈ N_i then we will say
that n is within the layer i). We are going to see that in FHINs the activity of a 
node in layer i solely depends on the activity of nodes in layers j < i. Formally, 
it is easier to deal with FHINs than with arbitrary inhibition nets just because 
of this layered structure. Lemma 121 leads to the following definition: 
Definition 122 ⟨N_0, ..., N_k⟩ (as given by lemma 121) is the canonical
partition of X.

Canonical partitions are in fact determined uniquely, but this is not 
important in the following. 

Example 123 

In the case of X_1 we get the canonical partition ⟨{bias, n_1, n_4}, {n_2, n_3}⟩.
In the case of X_2 the canonical partition is ⟨{bias, n_2}, {n_3}, {n_1}⟩.
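
The construction in the proof of Lemma 121 can be turned into a short computation: the layer of a node is the length of the longest path that ends in it. The following Python sketch is an illustration only; the function name canonical_partition and the encoding of the two example nets (as we read Examples 119 and 120, with n4 as the inhibitor in the first net) are assumptions of ours, and the function presupposes that the net is finite and hierarchical.

```python
from functools import lru_cache


def canonical_partition(N, E, I):
    """Layers N_0, ..., N_k of a finite hierarchical net (cycles are not handled)."""
    pred = {n: set() for n in N}
    for m, n in E:                      # m E n
        pred[n].add(m)
    for m, (_, n) in I:                 # m I (n', n)
        pred[n].add(m)

    @lru_cache(maxsize=None)
    def layer(n):
        # Length of the longest path that ends in n; 0 if nothing leads to n.
        return 1 + max(map(layer, pred[n])) if pred[n] else 0

    depth = max(layer(n) for n in N)
    return [{n for n in N if layer(n) == i} for i in range(depth + 1)]


# The two example nets, as we read Examples 119 and 120.
N1 = {"bias", "n1", "n2", "n3", "n4"}
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}

N2 = {"bias", "n1", "n2", "n3"}
E2 = {("bias", "n1"), ("n2", "n3")}
I2 = {("n3", ("bias", "n1"))}

layers1 = canonical_partition(N1, E1, I1)
layers2 = canonical_partition(N2, E2, I2)
```

In the second net, n1 ends up in the last layer although its only excitatory predecessor is bias: the inhibitory connection from n3 to the edge (bias, n1) counts as a connection from n3 to n1.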

The layered structure of FHINs is analogous to the layered structure 
of various neural structures and artificial neural networks. In the latter case, 
however, the nodes of a layer i may usually only be connected to the nodes 
of the subsequent layer i + 1, whereas in our case the nodes of layer i may be 
connected to the nodes of layers with any index larger than i, or to excitatory 
connections leading to nodes of such layers. 

Lemma 121 entails the following straightforward though important remark:

Remark 124

Let n ∈ N_i:

1. ∃m ∈ N s.t. m E n iff

∃m ∈ N_j with j < i s.t. m E n.

2. ∃m' ∈ N s.t. m' I (m, n) (for some m ∈ N) iff

∃m' ∈ N_u with u < i s.t. m' I (m, n) (for some m ∈ N).

14.3 Inhibition Nets as Dynamical Systems 

In analogy to usual neural networks also inhibition nets may be considered as 
dynamical systems. First of all, we postulate that the nodes of inhibition nets 
may have a certain activity. We will restrict the types of such activity states of 
nodes to discrete binary states, i.e. to 1 (“on”) and 0 (“off”). 

We assume that nets are fed by inputs which dictate certain nodes to 
fire independently of the current net state. This is the external causal dynamics 
of inhibition nets. Typically, these inputs may be thought of as being caused 
by sensory inputs. The only generalization we apply compared to the inputs to 
usual neural networks is that we allow inputs to affect the whole network and 
not just a distinguished layer of input nodes. 

The internal causal dynamics of inhibition nets is the evolution of states 
determined by the input and the topology of the network. The main rule gov- 
erning the state transitions within inhibition nets is: a node n is excited if 
and only if (i) it is directly excited by the input, or (ii) there is an excitatory 
connection e from a further node m to n, s.t. m is itself active and e is not
inhibited by yet another active node which is inhibitorily connected to e. Thus
the nodes which are connected inhibitorily to excitatory lines have a function
similar to undercutting defeaters (cf. Pollock[124]) in defeasible logic, though
on a completely different level of computation. In the next chapter we are going
to use finite hierarchical inhibition nets as the central systems of cognitive
network agents, where the dispositional central system is given by the topology
of a fixed inhibition net, and therefore never changes its state. We will use the
index ‘c, o’ as we have done in the previous parts, and as we will do in the
appendix, in order to denote the parameter-settings of A's occurrent central
subsystem. This is justified in the next chapter.

Put formally, the dynamics of inhibition nets is defined thus: 

Definition 125 (Dynamics of Inhibition Nets) 

Let $\mathfrak{I} = \langle N, E, I, bias \rangle$ be an inhibition net.

Let $S^{c,o}_{\mathfrak{I}} = \{ s^{c,o} \mid s^{c,o} : N \to \{0, 1\} \text{ with } s^{c,o}(bias) = 1 \}$ be the space of
parameter-settings of the net $\mathfrak{I}$ (we will generally omit the reference to $\mathfrak{I}$ and
just use the notation $S^{c,o}$ for the sake of simplicity).

Let $s^* \in S^{c,o}$ be an arbitrary parameter-setting of $\mathfrak{I}$ (the input to the
central system; as we will see in the next chapter, $s^*$ should be thought of as
being determined by the parameter-setting of an associated perceptual system):

let $F_{s^*} : S^{c,o} \to S^{c,o}$ s.t. for all $n \in N \setminus \{bias\}$: $F_{s^*}(s^{c,o})(n) = 1$ iff

1. $s^*(n) = 1$, or

2. $\exists n_1 \in N\,(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$.

Then $F_{s^*}$ is the state transition function given relative to the input $s^*$
and the net $\mathfrak{I}$ (we will again omit the reference to $\mathfrak{I}$).

If $s^{c,o}$ is a parameter-setting of $\mathfrak{I}$ and $s^{c,o}(n) = 1$, we say that n fires or
that n is active (in $s^{c,o}$). A set of nodes is called ‘active’ if each of its members
is active. From time to time we will identify a parameter-setting (which is
a function) with the set of neurons active in the very state: e.g. if we say that
$s_1^{c,o} \subseteq s_2^{c,o}$, we actually mean that for all $n \in N$: if $s_1^{c,o}(n) = 1$ then $s_2^{c,o}(n) = 1$.

The ‘if’ direction of the clause for $F_{s^*}$ above says that if a node is
caused to fire, it indeed fires; on the other hand, the ‘only if’ direction states
that a node should only fire if it is also caused to fire. The inhibition of an
excitatory connection is always dominant over the simultaneous impulse within
the very excitatory connection. Note that the bias node fires in every parameter-
setting $s^{c,o}$. The bias commits the net to a certain preferred parameter-setting
of minimal energy which the net always reaches in the case of lacking “stress”,
i.e., input from outside. Such bias nodes are also employed in some of the usual
neural networks.
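The transition rule of def.125 may be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: parameter-settings are represented by the sets of their active nodes (as the text itself suggests), the name `step` and the data encoding are ours, and the edges of the first example net are inferred from example 123 and table 1 rather than quoted.

```python
def step(E, I, inp, state):
    """One application of the state transition function for input `inp`.

    E: set of excitatory edges (m, n); I: set of inhibitory links (h, (m, n));
    inp, state: collections of active node names.
    """
    nxt = {"bias"} | set(inp)                  # bias and input nodes always fire
    for (m, n) in E:
        # an edge is blocked if some active node inhibits exactly this edge
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


# Hypothetical encoding of the first example net (edges inferred, not quoted).
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}

after_n1 = step(E1, I1, set(), {"bias", "n1"})
after_n1_n4 = step(E1, I1, set(), {"bias", "n1", "n4"})
```

In `after_n1` both nodes excited by n1 fire, whereas in `after_n1_n4` the inhibited edge to n2 stays silent; since the input is empty here, n1 and n4 themselves are no longer active after the step.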
For each $s^{c,o} \in S^{c,o}$ (and each given input $s^* \in S^{c,o}$) the iterated
application of $F_{s^*}$ defines the following trajectory of parameter-settings:

t = 0 : $s^{c,o}$

t = 1 : $F^1_{s^*}(s^{c,o}) = F_{s^*}(s^{c,o})$

t = 2 : $F^2_{s^*}(s^{c,o}) = F_{s^*}(F_{s^*}(s^{c,o}))$

t = 3 : $F^3_{s^*}(s^{c,o}) = F_{s^*}(F_{s^*}(F_{s^*}(s^{c,o})))$

$\vdots$

$F^k_{s^*}(s^{c,o})$ may be considered as the net parameter-setting at time k given that
$s^{c,o}$ has been the initial parameter-setting at time 0, and given an input $s^*$,
which is considered to be constant for a sufficient amount of time. $\langle S^{c,o}, F_{s^*} \rangle$
is a discrete dynamical system which is associated with the input $s^*$ and the
net $\mathfrak{I}$. $(\langle S^{c,o}, F_{s^*} \rangle)_{s^* \in S^{c,o}}$ is a family of discrete dynamical systems associated
with $\mathfrak{I}$. Dynamical systems such as $\langle S^{c,o}, F_{s^*} \rangle$ are called ‘discrete’ since they
evolve in discrete temporal steps of unit duration.

At time 0 the net is in a certain parameter-setting $s^{c,o}$ and an input $s^*$
is fed into the network. $s^*$ excites a certain set of nodes and consequently each
of these nodes fires at time 1. Since the input is considered to be invariant for
a sufficient temporal duration, this happens over and over again at each of the
subsequent steps. Additionally, the bias node fires at each point of time. The
“energy” emerging from the input and the bias node spreads from every step
t to the subsequent step t + 1 via the excitatory connections, but only if these
connections are not inhibited by the activity of further nodes at time t.

E.g. consider $\mathfrak{I}_1$: if $n_1$ is the only node that fires at time 0, $n_2$ is caused
to fire at time 1, but if both $n_1$ and $n_4$ fire initially, then $n_2$ does not fire at
the next step due to inhibition.
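The spreading of activity under a constant input may be traced step by step. The following Python sketch is only an illustration: the names `step` and `trajectory`, the set-based encoding of parameter-settings, and the edges of the first example net are our assumptions (the edges are inferred from the examples, not quoted).

```python
def step(E, I, inp, state):
    """One state transition under the constant input `inp`."""
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def trajectory(E, I, inp, start, steps):
    """Return the parameter-settings at times 0, 1, ..., steps."""
    states = [frozenset(start) | {"bias"}]     # the bias node fires at time 0 as well
    for _ in range(steps):
        states.append(step(E, I, inp, states[-1]))
    return states


# Hypothetical encoding of the first example net (edges inferred, not quoted).
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}

traj = trajectory(E1, I1, {"n1"}, set(), 3)
traj_inhibited = trajectory(E1, I1, {"n1", "n4"}, set(), 3)
```

Starting from the setting in which only the bias fires, `traj` reaches a setting that no longer changes after two steps; in `traj_inhibited` the additional input n4 keeps n2 silent throughout.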

In the case of an FHIN the second part of the right hand side of the
‘iff’ clause in def.125 may be put into a different shape by the following remark:

Remark 126

Let $n \in N_i$ (where $\mathfrak{I}$ has the canonical partition $\langle N_0, \ldots, N_k \rangle$, as it
will be assumed for the whole of this section):

$\exists n_1 \in N\,(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$

iff

$\exists n_1 \in N_j$ with $j < i$ s.t.
$(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$.

This is a consequence of remark 124.

Due to their hierarchical structure FHINs can be shown to possess a
unique stable (resonant, equilibrium) parameter-setting for each input $s^*$. Here
we use the following definition of a stable parameter-setting:

Definition 127 (Stable Parameter-Settings in Inhibition Nets)

$s^{c,o}$ is a stable parameter-setting under input $s^*$ iff $F_{s^*}(s^{c,o}) = s^{c,o}$,
i.e. if $s^{c,o}$ is a fixed point of $F_{s^*}$.

E.g. (1, 0, 0) is the stable parameter-setting of $\mathfrak{I}_2$ for the input (0, 0, 0).
Note that a parameter-setting $s^{c,o}$ may be a stable parameter-setting under the
input $s_1^*$ and a stable parameter-setting under the input $s_2^*$ at the same time
although $s_1^* \neq s_2^*$.

The “stability property” of FHINs is stated by the following theorem: 

Theorem 128 (Stability Property) 

For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, for every $s^* \in S^{c,o}$ there is exactly
one stable parameter-setting $s^{c,o}$ of $\mathfrak{I}$ under the input $s^*$.

Proof:

1. First we show existence:

let $s^{c,o} : N \to \{0, 1\}$ be inductively defined, s.t.

• for all $n \in N_0$: $s^{c,o}(n) = 1$ iff ($s^*(n) = 1$ or $n = bias$)

• for all $n \in N_i$ ($i > 0$): $s^{c,o}(n) = 1$ iff ($s^*(n) = 1$ or $\exists n_1 \in N_j$ with
$j < i\,(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$).

$s^{c,o}$ is a stable parameter-setting under input $s^*$ (we use remark 126):

• for all $n \in N_0$: $s^{c,o}(n) = 1$ iff ($s^*(n) = 1$ or $n = bias$) iff $F_{s^*}(s^{c,o})(n) = 1$

• for all $n \in N_i$ ($i > 0$): $s^{c,o}(n) = 1$ iff ($s^*(n) = 1$ or $\exists n_1 \in N_j$ with
$j < i\,(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$)
iff, by remark 126,
($s^*(n) = 1$ or $\exists n_1 \in N$ with
$(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$) iff
$F_{s^*}(s^{c,o})(n) = 1$.

2. Uniqueness may be shown inductively: let $s'^{c,o}$ be stable under $s^*$, but
$s'^{c,o} \neq s^{c,o}$ ($s^{c,o}$ as defined in 1). Since $s'^{c,o}$ is stable, $s'^{c,o}(n) =
F_{s^*}(s'^{c,o})(n)$, and we use again remark 126:

• for all $n \in N_0$: $s'^{c,o}(n) = 1$ iff $F_{s^*}(s'^{c,o})(n) = 1$ iff ($s^*(n) = 1$ or
$n = bias$) iff $s^{c,o}(n) = 1$

• assume that for all $n \in N_0 \cup \ldots \cup N_{i-1}$: $s'^{c,o}(n) = s^{c,o}(n)$:

then for all $n \in N_i$: $s'^{c,o}(n) = 1$ iff $F_{s^*}(s'^{c,o})(n) = 1$ iff ($s^*(n) = 1$
or $\exists n_1 \in N_j$ with
$j < i\,(s'^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(s'^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$)
iff, by induction hypothesis,
($s^*(n) = 1$ or $\exists n_1 \in N_j$ with
$j < i\,(s^{c,o}(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(s^{c,o}(n_2) = 1,\ n_2\,I\,\langle n_1, n \rangle))$) iff
$F_{s^*}(s^{c,o})(n) = 1$ iff $s^{c,o}(n) = 1$. ■



Theorem 128 justifies the following definition:

Definition 129 (Closure Operator for Inhibition Nets)

For every FHIN $\mathfrak{I}$ let $Cl : S^{c,o} \to S^{c,o}$ s.t. $Cl(s^*)$ is the unique stable
parameter-setting under input $s^*$ (actually, $Cl = Cl_{\mathfrak{I}}$, but we will often drop
the index ‘$\mathfrak{I}$’). $Cl$ is the closure operator of $\mathfrak{I}$, $Cl(s^*)$ is the closure of $s^*$.

In table 1 we have listed the closure parameter-settings for the example
net $\mathfrak{I}_1$:

Input:          Closure:

(0,0,0,0)       (0,0,0,0)
(0,0,0,1)       (0,0,0,1)
(0,0,1,0)       (0,0,1,0)
(0,0,1,1)       (0,0,1,1)
(0,1,0,0)       (0,1,0,0)
(0,1,0,1)       (0,1,0,1)
(0,1,1,1)       (0,1,1,1)
(1,0,0,0)       (1,1,1,0)
(1,0,0,1)       (1,0,1,1)
(1,0,1,0)       (1,1,1,0)
(1,0,1,1)       (1,0,1,1)
(1,1,0,0)       (1,1,1,0)
(1,1,0,1)       (1,1,1,1)
(1,1,1,0)       (1,1,1,0)
(1,1,1,1)       (1,1,1,1)

Table 1: Closure Parameter-Settings for $\mathfrak{I}_1$
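The rows of table 1 can be recomputed by iterating the transition function until nothing changes. The sketch below is illustrative: `step`, `closure` and `closure_row` are our names, the edges of the example net are inferred from the table and the examples (they are an assumption, not a quotation), and the tuple positions are read as the activities of the nodes n1 to n4.

```python
def step(E, I, inp, state):
    """One state transition under the constant input `inp`."""
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def closure(E, I, inp):
    """Iterate from the input itself until a fixed point is reached.
    Terminates for hierarchical nets; may loop forever otherwise."""
    state = frozenset(inp) | {"bias"}
    while True:
        nxt = step(E, I, inp, state)
        if nxt == state:
            return state
        state = nxt


# Hypothetical encoding of the example net (edges inferred, not quoted).
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}
ORDER = ("n1", "n2", "n3", "n4")


def closure_row(bits):
    """Map an input tuple such as (1, 0, 0, 1) to its closure tuple."""
    inp = {n for n, b in zip(ORDER, bits) if b}
    cl = closure(E1, I1, inp)
    return tuple(int(n in cl) for n in ORDER)
```

For instance, `closure_row((1, 0, 0, 1))` can be compared with the corresponding row of the table.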

Stable (resonant) parameter-settings play a prominent role in the literature
on neural networks. Often they are considered to be the “answers” of
neural networks to inputs (“questions”), and this is also our motivation for
studying such parameter-settings.

FHINs do not only possess unique stable parameter-settings $Cl(s^*)$ for
all inputs $s^*$, but the parameter-settings of an FHIN may even be shown to
finally converge to $Cl(s^*)$ under the constant input $s^*$, where the selection of
the initial parameter-setting is irrelevant:

Theorem 130 (Convergence Property) 

For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, every input $s^*$ and every initial
parameter-setting $s^{c,o}$:

$F^i_{s^*}(s^{c,o}) \restriction_{N_0 \cup \ldots \cup N_{i-1}} = Cl(s^*) \restriction_{N_0 \cup \ldots \cup N_{i-1}}$ (for $0 < i \leq k$), i.e.,
$F^i_{s^*}(s^{c,o})$ is identical to $Cl(s^*)$ on all nodes in $N_0 \cup \ldots \cup N_{i-1}$.

Proof:

By induction over indices i of the partition sets (we presuppose again
remark 126):

in theorem 128 we have constructed the closure state $Cl(s^*)$ recursively
- now we use this construction:

• for all $n \in N_0$: $F^1_{s^*}(s^{c,o})(n) = 1$ iff ($s^*(n) = 1$ or $n = bias$)
iff, by theorem 128, $Cl(s^*)(n) = 1$, and thus $F^1_{s^*}(s^{c,o}) \restriction_{N_0} = Cl(s^*) \restriction_{N_0}$

• assume that $F^j_{s^*}(s^{c,o}) \restriction_{N_0 \cup \ldots \cup N_{j-1}} = Cl(s^*) \restriction_{N_0 \cup \ldots \cup N_{j-1}}$ for all $j < i$
(with $i \geq 2$), i.e., for all $n \in N_0 \cup \ldots \cup N_{j-1}$ it holds that $F^j_{s^*}(s^{c,o})(n) =
Cl(s^*)(n)$;

for all $n \in N_0 \cup \ldots \cup N_{i-1}$: $F^i_{s^*}(s^{c,o})(n) = 1$ iff, by def.125 and remark
126, ($s^*(n) = 1$ or $n = bias$ or $\exists n_1 \in N_j$ with $j < i$, s.t.
$(F^{i-1}_{s^*}(s^{c,o})(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(F^{i-1}_{s^*}(s^{c,o})(n_2)
= 1,\ n_2\,I\,\langle n_1, n \rangle))$) iff, by inductive hypothesis,
($s^*(n) = 1$ or $n = bias$ or $\exists n_1 \in N_j$ with $j < i$, s.t.
$(Cl(s^*)(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_u$ with $u < i\,(Cl(s^*)(n_2) = 1,\
n_2\,I\,\langle n_1, n \rangle))$) iff, by the construction of $Cl(s^*)$ in theorem 128 again,
$Cl(s^*)(n) = 1$. ■

Now we can specify what we have meant by qualifying an input as
‘constant for a sufficient amount of time’, as we have done on p.232: if $i > k$
where k is the number of layers of $\mathfrak{I}$, then $F^i_{s^*}(s^{c,o}) = Cl(s^*)$, i.e.: if the
input is constant for more than k units of time, the net definitely converges
to a stable parameter-setting which is only dependent on the input. This also
entails that the iterated application of $F_{s^*}$ does not generate any cycles apart
from the single loop $F_{s^*}(Cl(s^*)) = Cl(s^*)$ of length one. In the terminology of
dynamical systems we might say that $Cl(s^*)$ is the only periodic parameter-
setting of $\langle S^{c,o}, F_{s^*} \rangle$.
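The independence of the limit from the initial parameter-setting can be checked exhaustively on a small net. The following Python sketch is an illustration under our own assumptions: the helper names are ours, and the edges of the first example net (which, on our reading, has the two layers of example 123) are inferred rather than quoted.

```python
from itertools import chain, combinations


def step(E, I, inp, state):
    """One state transition under the constant input `inp`."""
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def subsets(xs):
    """All subsets of xs, as tuples."""
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))


def iterate(E, I, inp, state, times):
    for _ in range(times):
        state = step(E, I, inp, state)
    return state


def converges_within(E, I, nodes, steps):
    """True iff, for every input, all initial settings reach one common
    stable setting after `steps` transitions."""
    for inp in subsets(nodes):
        finals = {iterate(E, I, inp, frozenset(init) | {"bias"}, steps)
                  for init in subsets(nodes)}
        if len(finals) != 1:
            return False
        (fixed,) = finals
        if step(E, I, inp, fixed) != fixed:
            return False
    return True


# Hypothetical encoding of the first example net (edges inferred, not quoted).
NODES = ("n1", "n2", "n3", "n4")
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}

two_steps_suffice = converges_within(E1, I1, NODES, 2)
one_step_suffices = converges_within(E1, I1, NODES, 1)
```

For this two-layered net two transitions are enough, while a single transition still leaves traces of the initial setting in the second layer.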
Due to the presence of inhibitory connections the state transitions in
inhibition nets are generally not monotonic, i.e. if $s_1^{c,o} \subseteq s_2^{c,o}$ then it does not
necessarily follow that also $Cl(s_1^{c,o}) \subseteq Cl(s_2^{c,o})$. E.g. in the case of $\mathfrak{I}_1$ we have
$\{n_1\} \subseteq \{n_1, n_4\}$ but $Cl(\{n_1\}) = \{n_1, n_2, n_3\} \not\subseteq \{n_1, n_3, n_4\} = Cl(\{n_1, n_4\})$.*
But, obviously, at least the following holds:

Remark 131 

1. For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, for every parameter-setting $s^{c,o}$:
$s^{c,o} \subseteq Cl(s^{c,o})$ (Inclusion)

2. For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, for every parameter-setting $s^{c,o}$:
$Cl(s^{c,o}) = Cl(Cl(s^{c,o}))$ (Idempotence)

3. For every FHIN $\mathfrak{I} = \langle N, E, \emptyset, bias \rangle$ (i.e. without inhibitory connections):
the operator $Cl$ is monotonic

4. For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, for every parameter-setting $s^{c,o}$:
$s^{c,o} \cap N_0 = Cl(s^{c,o}) \cap N_0$.
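The four claims of remark 131, and the failure of monotonicity noted before it, may be tested mechanically on a small net. The sketch below is illustrative only; the helper names, the set encoding, and the edges of the first example net are our assumptions (the edges are inferred from the examples and table 1).

```python
from itertools import combinations


def step(E, I, inp, state):
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def closure(E, I, inp):
    """Fixed point reached from the input itself (hierarchical nets only)."""
    state = frozenset(inp) | {"bias"}
    while True:
        nxt = step(E, I, inp, state)
        if nxt == state:
            return state
        state = nxt


# Hypothetical encoding of the first example net (edges inferred, not quoted).
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}
N0 = {"bias", "n1", "n4"}                      # lowest layer, as in example 123
ALL = [frozenset(c) for r in range(5)
       for c in combinations(("n1", "n2", "n3", "n4"), r)]

inclusion = all(s <= closure(E1, I1, s) for s in ALL)
idempotence = all(closure(E1, I1, closure(E1, I1, s)) == closure(E1, I1, s)
                  for s in ALL)
layer0_fixed = all((s | {"bias"}) & N0 == closure(E1, I1, s) & N0 for s in ALL)
# Claim 3: dropping all inhibitory links makes the closure monotonic.
monotonic_without_inhibition = all(
    closure(E1, set(), s) <= closure(E1, set(), t)
    for s in ALL for t in ALL if s <= t)
# With the inhibitory link present, monotonicity fails for {n1} and {n1, n4}.
monotonic_counterexample = not (
    closure(E1, I1, {"n1"}) <= closure(E1, I1, {"n1", "n4"}))
```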

As a kind of substitute for monotonicity the following two weakenings 
of monotonicity may be proved: 

Lemma 132 ( ‘‘Cumulativity” ) 

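For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, for all parameter-settings $s_1^{c,o}$, $s_2^{c,o}$:
if $s_1^{c,o} \subseteq s_2^{c,o} \subseteq Cl(s_1^{c,o})$ then $Cl(s_1^{c,o}) = Cl(s_2^{c,o})$.

Proof:

Assume that $s_1^{c,o} \subseteq s_2^{c,o} \subseteq Cl(s_1^{c,o})$; by induction over the partition sets
(and we use remarks 126 and 131):

1. (Induction Basis)

$N_0 \cap Cl(s_1^{c,o}) = N_0 \cap Cl(s_2^{c,o})$, since $N_0 \cap s_1^{c,o} = N_0 \cap Cl(s_1^{c,o})$ (by remark
131),
and $N_0 \cap s_1^{c,o} \subseteq N_0 \cap s_2^{c,o} \subseteq N_0 \cap Cl(s_1^{c,o})$ by assumption, it follows that
$N_0 \cap s_2^{c,o} = N_0 \cap Cl(s_1^{c,o})$,
but $N_0 \cap s_2^{c,o} = N_0 \cap Cl(s_2^{c,o})$ by remark 131 again.

2. (Induction Step)

Now assume that for all $m \in N_0 \cup \ldots \cup N_i$ ($i < k$): $Cl(s_1^{c,o})(m) =
Cl(s_2^{c,o})(m)$. Let $n \in N_{i+1}$: if $n \in s_1^{c,o} \subseteq s_2^{c,o}$ then $Cl(s_1^{c,o})(n) = Cl(s_2^{c,o})(n)
= 1$ and we are done.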

*This example is stated incorrectly in Leitgeb[90], p.171.
Thus suppose $n \notin s_1^{c,o}$: $Cl(s_1^{c,o})(n) = 1$ iff
$\exists n_1 \in N_u$ with $u < i + 1$ s.t.
$(Cl(s_1^{c,o})(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_v$ with $v < i + 1\,(Cl(s_1^{c,o})(n_2) = 1,\
n_2\,I\,\langle n_1, n \rangle))$ iff, by inductive assumption,
$\exists n_1 \in N_u$ with $u < i + 1$ s.t.
$(Cl(s_2^{c,o})(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_v$ with $v < i + 1\,(Cl(s_2^{c,o})(n_2) = 1,\
n_2\,I\,\langle n_1, n \rangle))$, which implies that $Cl(s_2^{c,o})(n) = 1$;

on the other hand, if $Cl(s_2^{c,o})(n) = 1$ then either $s_2^{c,o}(n) = 1$ and by our
assumption that $s_2^{c,o} \subseteq Cl(s_1^{c,o})$ we have that $Cl(s_1^{c,o})(n) = 1$,
or else $s_2^{c,o}(n) = 0$ and thus $\exists n_1 \in N_u$ with $u < i + 1$ s.t.
$(Cl(s_2^{c,o})(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_v$ with $v < i + 1\,(Cl(s_2^{c,o})(n_2) = 1,\
n_2\,I\,\langle n_1, n \rangle))$, but then by inductive assumption also $\exists n_1 \in N_u$ with
$u < i + 1$ s.t.
$(Cl(s_1^{c,o})(n_1) = 1,\ n_1\,E\,n,\ \neg\exists n_2 \in N_v$ with $v < i + 1\,(Cl(s_1^{c,o})(n_2) = 1,\
n_2\,I\,\langle n_1, n \rangle))$,
and therefore again $Cl(s_1^{c,o})(n) = 1$.

So we have that $Cl(s_1^{c,o})(n) = 1$ iff $Cl(s_2^{c,o})(n) = 1$. ■

If Cl were an operator on sets of formulas it would thus be called 
‘cumulative’ according to common usage (see Makinson[99], p.43). 
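Cumulativity in the sense of lemma 132 can be verified by brute force on a small net. The following Python sketch is illustrative; the function names and the edges of the first example net are our assumptions (the edges are inferred from the examples, not quoted).

```python
from itertools import combinations


def step(E, I, inp, state):
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def closure(E, I, inp):
    """Fixed point reached from the input itself (hierarchical nets only)."""
    state = frozenset(inp) | {"bias"}
    while True:
        nxt = step(E, I, inp, state)
        if nxt == state:
            return state
        state = nxt


def cumulative(E, I, settings):
    """Check: whenever s1 is included in s2 and s2 in Cl(s1), both closures agree."""
    for s1 in settings:
        c1 = closure(E, I, s1)
        for s2 in settings:
            if s1 <= s2 <= c1 and closure(E, I, s2) != c1:
                return False
    return True


# Hypothetical encoding of the first example net (edges inferred, not quoted).
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}
ALL = [frozenset(c) for r in range(5)
       for c in combinations(("n1", "n2", "n3", "n4"), r)]

holds_on_example = cumulative(E1, I1, ALL)
```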

Lemma 133 (“Loop”) 

For every FHIN $\mathfrak{I} = \langle N, E, I, bias \rangle$, for all parameter-settings $s_0^{c,o}, \ldots, s_j^{c,o}$:

if $s_1^{c,o} \subseteq Cl(s_0^{c,o})$, $s_2^{c,o} \subseteq Cl(s_1^{c,o})$, ..., $s_j^{c,o} \subseteq Cl(s_{j-1}^{c,o})$, $s_0^{c,o} \subseteq Cl(s_j^{c,o})$

then $Cl(s_r^{c,o}) = Cl(s_{r'}^{c,o})$ for $r, r' \in \{0, \ldots, j\}$.

Proof:

Assume that $s_1^{c,o} \subseteq Cl(s_0^{c,o})$, $s_2^{c,o} \subseteq Cl(s_1^{c,o})$, ..., $s_j^{c,o} \subseteq Cl(s_{j-1}^{c,o})$, $s_0^{c,o} \subseteq
Cl(s_j^{c,o})$. Now by induction over the partition sets (within this proof, summation
of indices is understood modulo j + 1):

1. (Induction Basis)

again by remark 131 we have $N_0 \cap Cl(s_0^{c,o}) \supseteq N_0 \cap Cl(s_1^{c,o}) \supseteq \ldots \supseteq
N_0 \cap Cl(s_j^{c,o}) \supseteq N_0 \cap Cl(s_0^{c,o})$, and thus $N_0 \cap Cl(s_0^{c,o}) = N_0 \cap Cl(s_1^{c,o}) =
\ldots = N_0 \cap Cl(s_j^{c,o})$.

2. (Induction Step)

Suppose that for all $n \in (N_0 \cup \ldots \cup N_i)$ with $i < k$: $Cl(s_r^{c,o})(n) = Cl(s_{r'}^{c,o})(n)$
for $r, r' \in \{0, \ldots, j\}$; let $n \in N_{i+1}$, $r \in \{0, \ldots, j\}$:

assume that $Cl(s_r^{c,o})(n) = 0$; then $n \notin s_r^{c,o}$ and $\neg\exists m \in N_u$ with
$u < i + 1\,(Cl(s_r^{c,o})(m) = 1,\ m\,E\,n,\ \neg\exists m' \in N_v$ with $v < i + 1$ s.t.
$(Cl(s_r^{c,o})(m') = 1,\ m'\,I\,\langle m, n \rangle))$. Now suppose $Cl(s_{r+1}^{c,o})(n) = 1$; by inductive
assumption $\neg\exists m \in N_u$ with $u < i + 1$ s.t.
$(Cl(s_{r+1}^{c,o})(m) = 1,\ m\,E\,n,\ \neg\exists m' \in N_v$ with $v < i + 1$ s.t.
$(Cl(s_{r+1}^{c,o})(m') = 1,\ m'\,I\,\langle m, n \rangle))$; therefore, n would have to be a member
of $s_{r+1}^{c,o}$. But since $s_{r+1}^{c,o} \subseteq Cl(s_r^{c,o})$, $Cl(s_r^{c,o})(n)$ would be 1, contradicting
$Cl(s_r^{c,o})(n) = 0$. Thus also $Cl(s_{r+1}^{c,o})(n) = 0$ and by the same reasoning
pattern $Cl(s_{r'}^{c,o})(n) = 0$ for all $r' \in \{0, \ldots, j\}$. If $Cl(s_r^{c,o})(n) = 1$ for some
r then there cannot be an r' s.t. $Cl(s_{r'}^{c,o})(n) = 0$, because otherwise we
would get a contradiction by inferring that $Cl(s_r^{c,o})(n) = 0$, as before. So,
$Cl(s_r^{c,o})(n) = Cl(s_{r'}^{c,o})(n)$ for $r, r' \in \{0, \ldots, j\}$. ■



If Cl were an operator on sets of formulas it would therefore be said
to satisfy ‘Loop’ (see KLM[85], p.187).
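The Loop property can likewise be checked exhaustively for short cycles on a small net. This Python sketch is only an illustration: the helper names are ours, the edges of the first example net are inferred rather than quoted, and the direction of the premise (each setting included in the closure of its predecessor, indices taken cyclically) follows our reading of the proof above.

```python
from itertools import combinations, product


def step(E, I, inp, state):
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def closure(E, I, inp):
    """Fixed point reached from the input itself (hierarchical nets only)."""
    state = frozenset(inp) | {"bias"}
    while True:
        nxt = step(E, I, inp, state)
        if nxt == state:
            return state
        state = nxt


# Hypothetical encoding of the first example net (edges inferred, not quoted).
E1 = {("n1", "n2"), ("n1", "n3")}
I1 = {("n4", ("n1", "n2"))}
ALL = [frozenset(c) for r in range(5)
       for c in combinations(("n1", "n2", "n3", "n4"), r)]
CL = {s: closure(E1, I1, s) for s in ALL}      # precomputed closures


def loop_holds(length):
    """Check Loop for all cycles of the given length over the 16 settings."""
    for cyc in product(ALL, repeat=length):
        premise = all(cyc[(r + 1) % length] <= CL[cyc[r]] for r in range(length))
        if premise and len({CL[s] for s in cyc}) != 1:
            return False
    return True


loop_2, loop_3 = loop_holds(2), loop_holds(3)
```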

Although inhibition nets obey the very simple local activation rule 
stated in def.125, they may nevertheless be quite complex automata. This is 
indicated by the following two theorems: 



Theorem 134 (Boolean Mappings by Inhibition Nets) 

Every Boolean mapping $f : \{0, 1\}^i \to \{0, 1\}$ may be computed by an
FHIN in the sense that there are i (“input”) nodes $n_1, \ldots, n_i$ and one (“output”)
node $n_f$ in the network, s.t. for all $\langle x_1, \ldots, x_i \rangle \in \{0, 1\}^i$: if $s^*(n_1) =
x_1, \ldots, s^*(n_i) = x_i$ then $Cl(s^*)(n_f) = f(x_1, \ldots, x_i)$.

Proof:

A (sub)net containing the three nodes bias, n, $n_\neg$, the excitatory edge
$\langle bias, n_\neg \rangle$, the inhibitory edge $\langle n, \langle bias, n_\neg \rangle \rangle$ and which contains no other edges
computes the truth value of the “negation” node $n_\neg$ of n. A (sub)net containing
the three nodes $n_1$, $n_2$, $n_\vee$, the excitatory edges $\langle n_1, n_\vee \rangle$ and $\langle n_2, n_\vee \rangle$ and which
contains no other edges computes the truth value of the “disjunction” $n_\vee$ of $n_1$
and $n_2$ (see fig. 3). Here ‘computation’ is understood again in the following
sense: if $s^*$ is given by a classical evaluation of n, or $n_1$ and $n_2$, then $F_{s^*}$
applied iteratively will obviously turn $s^*$ into a stable parameter-setting in which
$n_\neg$ fires iff n does not fire, and in which $n_\vee$ fires iff $n_1$ fires or $n_2$ fires. But since
every propositional formula with i propositional variables is logically equivalent
to a formula which has the same set of propositional variables but in which
only the negation and the disjunction sign are used as logical connectives, every
Boolean function f may be computed by composition of subnets isomorphic to
those sketched before. ■

Fig. 3: Boolean Mappings by Inhibition Nets
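The composition of negation and disjunction subnets may be tried out on Boolean conjunction, written as the negation of the disjunction of two negations. The sketch below is an illustration under our own assumptions: the node names (`not_a`, `not_b`, `or`, `out`), the helper names and the set encoding are ours and do not stem from the figure.

```python
def step(E, I, inp, state):
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def closure(E, I, inp):
    """Fixed point reached from the input itself (hierarchical nets only)."""
    state = frozenset(inp) | {"bias"}
    while True:
        nxt = step(E, I, inp, state)
        if nxt == state:
            return state
        state = nxt


# Two negation subnets feed a disjunction subnet, whose output is negated again.
E_AND = {("bias", "not_a"), ("bias", "not_b"),      # negation of a and of b
         ("not_a", "or"), ("not_b", "or"),          # disjunction of the negations
         ("bias", "out")}                           # negation of the disjunction
I_AND = {("a", ("bias", "not_a")),
         ("b", ("bias", "not_b")),
         ("or", ("bias", "out"))}


def conj(a, b):
    """Read the output node in the closure of the input given by the bits a, b."""
    inp = {n for n, bit in (("a", a), ("b", b)) if bit}
    return int("out" in closure(E_AND, I_AND, inp))


truth_table = {(a, b): conj(a, b) for a in (0, 1) for b in (0, 1)}
```

Note that the output node may fire transiently before the disjunction node has become active; only the stable parameter-setting is read off.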

Theorem 135 (Finite State Machines by Inhibition Nets) 

1. Every finite inhibition net is a finite state machine

2. every finite state machine (FSM) may be simulated by a finite inhibition
net (FIN) in the following sense: for every internal state q of the FSM
there is a corresponding (“internal state”) node $n_q$ in the FIN; for every
input i of the FSM there is a corresponding (“input”) node $n_i$ in the FIN;
for every output o of the FSM there is a corresponding (“output”) node
$n_o$ in the FIN; if $\lambda$ is the next-state function of the FSM, and $\delta$ is the
next-output function of the FSM, there is a fixed natural number c, s.t. for
arbitrary internal states q and inputs i of the FSM, it holds: $\lambda(q, i) = q'$
and $\delta(q, i) = o$ iff $F^c_{\{n_i\}}(\{n_q\}) = \{n_{q'}, n_o\}$.

Proof: 

The first claim is a direct consequence of the definition of inhibition
nets and the definition of finite state machines (see e.g. Arbib[7], p.8). The
set of inputs and the set of outputs of a finite state machine may simply be
identified with the parameter-setting space $S^{c,o}$.

The second claim is proved in the same way as it is shown that a 
McCulloch- Pitts net may simulate every finite state machine (see Rojas[135], 
PP-4V)' The only adaptation to be made is that McCulloch- Pitts nodes with 
threshold 2 have to be replaced by “conjunction nodes” - this is possible
according to theorem 134. c depends on the depth of the subnet that is used to
compute Boolean conjunction. Where only input lines and output lines are used
in [135], we add the corresponding input nodes and output nodes. Of course,
the representation of a finite state machine by a McCulloch-Pitts net or by an
inhibition net is generally a notoriously inefficient one. ■

Note that not every finite state machine can be simulated by an FHIN, 
since the latter are hierarchical, and thus they generally cannot simulate FSMs 
where the next-state function leads to loops in the internal state transition. 

As a final remark on the dynamics of inhibition nets we want to jus- 
tify the restriction to hierarchical inhibition nets by the simple fact that non- 
hierarchical inhibition nets violate the stability property (and therefore also 
the convergence property): 
Lemma 136

1. For some non-hierarchical nets $\mathfrak{I} = \langle N, E, I, bias \rangle$ there are inputs $s^*$,
s.t. there is no stable parameter-setting under $s^*$

2. for some non-hierarchical nets $\mathfrak{I} = \langle N, E, I, bias \rangle$ there are inputs $s^*$, s.t.
there is more than one stable parameter-setting under $s^*$.

Proof:

1. E.g. let $N = \{bias, m\}$, s.t. $E = \{\langle bias, m \rangle\}$ and $I = \{\langle m, \langle bias, m \rangle \rangle\}$;
let $s^*$ be the parameter-setting in which only the bias fires. Then it is easy
to see that there is no parameter-setting of $\mathfrak{I} = \langle N, E, I, bias \rangle$ stable
under input $s^*$. Call m in $\mathfrak{I}$ the “Inhibition Liar”.

2. E.g. let $N = \{bias, n\}$, s.t. $E = \{\langle n, n \rangle\}$ and $I = \emptyset$; let $s^*$ again be
the parameter-setting in which only the bias fires. Then there are distinct
parameter-settings $s_1^{c,o}$, $s_2^{c,o}$ of $\mathfrak{I} = \langle N, E, I, bias \rangle$ stable under input $s^*$,
s.t. $s_1^{c,o}(bias) = s_2^{c,o}(bias) = 1$, $s_1^{c,o}(n) = 1$, $s_2^{c,o}(n) = 0$. Call n in $\mathfrak{I}$ the
“Inhibition Truthteller”. ■
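Both counterexamples are small enough to enumerate all candidate parameter-settings. The following Python sketch is an illustration; the function names and the set encoding are our assumptions, while the two nets are those given in the proof.

```python
from itertools import combinations


def step(E, I, inp, state):
    nxt = {"bias"} | set(inp)
    for (m, n) in E:
        blocked = any(h in state for (h, e) in I if e == (m, n))
        if m in state and not blocked:
            nxt.add(n)
    return frozenset(nxt)


def stable_settings(nodes, E, I, inp):
    """Enumerate every setting in which bias fires and keep the fixed points."""
    others = [n for n in nodes if n != "bias"]
    found = []
    for r in range(len(others) + 1):
        for combo in combinations(others, r):
            s = frozenset(combo) | {"bias"}
            if step(E, I, inp, s) == s:
                found.append(s)
    return found


# "Inhibition Liar": m is excited by bias and inhibits that very excitation.
liar = stable_settings({"bias", "m"},
                       {("bias", "m")},
                       {("m", ("bias", "m"))},
                       {"bias"})

# "Inhibition Truthteller": n excites itself, nothing is inhibited.
truthteller = stable_settings({"bias", "n"},
                              {("n", "n")},
                              set(),
                              {"bias"})
```

The Liar has no stable parameter-setting at all, whereas the Truthteller has two, one with n active and one with n silent.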

Lemma 136 does not imply that non-hierarchical inhibition nets are 
“defective” in any way. It just shows that they induce activity patterns in 
far more complicated ways than the hierarchical nets, since the latter exclude 
feedback. This gain of simplicity is one reason why we restrict ourselves to the 
study of FHINs in this part IV. A second reason is that it is not so clear what 
we should count as the “answer” of a nonhierarchical net to an input since there 
is generally no single stable parameter-setting or no stable parameter-setting 
at all. But in the appendix we are going to consider more general inhibition 
nets than FHINs, as well as more specific ones, i.e., subclasses of the class of 
FHINs. 




Chapter 15 

INTERPRETED INHIBITION NET AGENTS 



We have advertized inhibition nets as kinds of formal imitations of biological 
“brainware” , though on a very high level of abstraction. But brains are parts 
of cognitive agents. In this section we are looking for an analogous assessment 
of inhibition nets as being parts of cognitive architectures, s.t., their states and 
processes are interpreted as cognitive states and cognitive processes. In section 
15.1 we will assume the central system of our agent A to be an inhibition net; in 
section 15.2, we will strengthen this assumption when we assume that A is an
interpreted network agent in order to ascribe occurrent perceptual and central 
state beliefs to A; finally, we will ascribe general dispositional beliefs as well as 
nonmonotonic inferences to the network agent A in section 15.3 and 15.4. In 
each of these sections we are going to presuppose the definitions, assumptions, 
and results from parts I and II, and the first chapter of the appendix. But 
chapter 16 can be read and understood also on the sole basis of the definitions 
and results that are stated in part III, in chapter 14 of part IV, and in this 
chapter. 



15.1 Inhibition Nets as Central Systems of Cognitive Agents 

In chapter 2 of part I and chapter 3 of part II (see also chapter 21 for a more detailed
account of the following formalism, and also for some motivating examples) we
have assumed our cognitive agent A to be a system of parameters evolving
in discrete time steps t = 0,1,2,..., s.t. A consists of a perceptual system,
a central system, and an action system. A’s central system consists of two
subsystems: the occurrent central system of the central state parameters that
normally change quickly and abruptly, and the dispositional central system of the
central state parameters that normally change slowly and gradually if changing
at all. We have associated with every parameter-setting $s^{c,d}$ of the dispositional
central system the way (i) in which a parameter-setting of the perceptual system
together with a parameter-setting of the occurrent central system leads to a
new parameter-setting of the central system under the condition that $s^{c,d}$ is the
current parameter-setting of the dispositional central system, and the way (ii) in
which a parameter-setting of the perceptual system together with a parameter-
setting of the occurrent central system leads to a new parameter-setting of
the action system, again given that $s^{c,d}$ is the current parameter-setting of the
dispositional central system:*

Definition 137 (Systems) 

By a system Sys we mean a tuple $\langle S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na \rangle$, s.t.

1. $S^p \neq \emptyset$ is the set of possible parameter-settings $s^p \in S^p$ of the perceptual
system of A

2. $S^{c,o} \neq \emptyset$ is the set of possible parameter-settings $s^{c,o} \in S^{c,o}$ of the
occurrent central system of A

3. $S^{c,d} \neq \emptyset$ is the set of possible parameter-settings $s^{c,d} \in S^{c,d}$ of the
dispositional central system of A

4. $S^c = S^{c,o} \times S^{c,d}$ is the set of possible parameter-settings $s^c = \langle s^{c,o}, s^{c,d} \rangle \in S^c$
of the central system of A

5. $S^a \neq \emptyset$ is the set of possible parameter-settings $s^a \in S^a$ of the action
system of A

6. $nc : S^{c,d} \to \{ f \mid f : S^p \times S^{c,o} \to S^c \}$, s.t.
$nc(s^{c,d})$ is a mapping from $S^p \times S^{c,o}$ to $S^c$, and $nc(s^{c,d})(s^p, s^{c,o})$ is the
next (i.e., at time t + 1) parameter-setting of the central system of A, if
$s^p$ has been the previous parameter-setting of the perceptual system of A,
and if $s^{c,o}$ has been the previous parameter-setting of the occurrent central
system of A (‘previous’ here means: at time t); $nc(s^{c,d})$ is associated with
$s^{c,d}$

7. $na : S^{c,d} \to \{ f \mid f : S^p \times S^{c,o} \to S^a \}$, s.t.
$na(s^{c,d})$ is a mapping from $S^p \times S^{c,o}$ to $S^a$, and $na(s^{c,d})(s^p, s^{c,o})$ is the
next (i.e., at time t + 1) parameter-setting of the action system of A, if
$s^p$ has been the previous parameter-setting of the perceptual system of A,
and if $s^{c,o}$ has been the previous parameter-setting of the occurrent central
system of A (‘previous’ means again: at time t); $na(s^{c,d})$ is associated
with $s^{c,d}$.

The set S of parameter-settings of a system Sys is the set of all
$s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}) \rangle$, s.t. $s^p \in S^p$, $s^{c,o} \in S^{c,o}$, $s^{c,d} \in S^{c,d}$,
$s^a \in S^a$. We will refer to S also as the ‘parameter-setting space’ of Sys.

• (Assumption on A’s Being a System)

A is a system as defined above.

*This is worked out more elaborately in the appendix, section 21.1. 
We are going to use the phrases ‘the system A’ and ‘the system of A’
interchangeably.



We have furthermore assumed that A’s perceptual system receives external
inputs, and that every such input to the perceptual system determines
precisely one parameter-setting of A’s perceptual system - this is the external
causal dynamics of the system A. The internal causal dynamics of the system A
is the evolution of central state parameter-settings determined by the current
parameter-setting of the perceptual system and the current parameter-setting
of the central system. The external and the internal dynamics together may be
represented by a state-transition mapping $F_{s^p_{new}} : S \to S$, where S is the set
of all parameter-settings $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}) \rangle$; if
$s = \langle s^p_{old}, s^{c,o}_{old}, s^{c,d}_{old}, s^a_{old}, nc(s^{c,d}_{old}), na(s^{c,d}_{old}) \rangle$ is
the parameter-setting at time t:

$F_{s^p_{new}}(s) = \langle s^p_{new}, s^{c,o}_{new}, s^{c,d}_{new}, s^a_{new}, nc(s^{c,d}_{new}), na(s^{c,d}_{new}) \rangle$ where

1. $s^p_{new}$ is the parameter-setting of the perceptual system at t + 1 ($s^p_{new}$ is
determined “from outside”)

2. $\langle s^{c,o}_{new}, s^{c,d}_{new} \rangle = nc(s^{c,d}_{old})(s^p_{old}, s^{c,o}_{old})$ is the parameter-setting of the central
system at t + 1

3. $s^a_{new} = na(s^{c,d}_{old})(s^p_{old}, s^{c,o}_{old})$ is the parameter-setting of the action system
at t + 1

4. $nc(s^{c,d}_{new})$ is the mapping associated with $s^{c,d}_{new}$ which defines the next
central state-parameter-setting under a new parameter-setting of the
perceptual system and the new parameter-setting of the occurrent central
system

5. $na(s^{c,d}_{new})$ is the mapping associated with $s^{c,d}_{new}$ which defines the next
action-parameter-setting under a new parameter-setting of the perceptual
system and the new parameter-setting of the occurrent central system.

$F_{s^p_{new}}$ is the state-transition function of the system $\langle S^p, S^{c,o}, S^{c,d}, S^c,
S^a, nc, na \rangle$ of A given relative to the next parameter-setting $s^p_{new}$ of the perceptual
system, where $s^p_{new}$ is the only parameter which is not determined by the
previous parameter-setting s of A, since $s^p_{new}$ is determined by the causal influence
of the environment on the system A. Note that we use the same metalinguistic
function sign ‘F’ to denote the state-transition function for systems that
we have also used in order to denote the state-transition function for inhibition
nets in the last chapter - what is meant should always be clear from the
context.



Let us now assume that the central system of A, which has been supposed
to be responsible for A’s inferential activities, is a finite hierarchical
inhibition net with a fixed topology of connections. The dynamics of A's central
system is identified with the dynamics of finite hierarchical inhibition nets
that we have defined in the last chapter.† Moreover, let us assume that A’s
perceptual system also consists of a set of nodes but without any connections
between the nodes, s.t. there is a bijective‡ mapping g from the set of nodes of
the perceptual subsystem to the set of nodes (except for the bias node) of the
central subsystem, and there is a connection from every node of the perceptual
subsystem to its “image node” as being given by g; but we do not count
these connections as belonging to A’s central inhibition network. The perceptual
system, which we will not describe in more detail, is assumed to generate
a pattern of activity distributed over its perceptual system units, where the
pattern is determined by the input from the “outside”. This pattern is transmitted
via the edges that connect the perceptual and the central subsystem
to the nodes of the latter, and thus the central system is fed by an input pattern
from the perceptual system. The dispositional central system remains in a
fixed parameter-setting, since the connections in the central inhibition net have
been regarded as being fixed (and there are also no weights to be changed).
The action system of A is again not focused on, but the parameter-settings
of A’s action system are considered to evolve in parallel to the evolution of
the parameter-settings of A’s central system, and each parameter-setting of
A’s action system may be thought of as causing some action, or a state of
inactivity.

So we have: 

Definition 138 (Systems Having an Inhibition Net as Their Central Subsys- 
tem) 

Let Sys = $\langle S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na \rangle$ be a system as above; let $\mathfrak{I} =
\langle N, E, I, bias \rangle$ be an inhibition net.

We say that Sys has $\mathfrak{I}$ under g as its central subsystem iff

g is a mapping from a set $N^p$ of nodes with $N^p \cap N = \emptyset$ to the set
$N \setminus \{bias\}$,

s.t. g is one-to-one and onto (entailing that $N^p$ and $N \setminus \{bias\}$ are of
the same cardinality), and

1. $S^p = \{ s^p \mid s^p : N^p \to \{0, 1\} \}$

2. $S^{c,o} = \{ s^{c,o} \mid s^{c,o} : N \to \{0, 1\} \text{ with } s^{c,o}(bias) = 1 \}$, the space of parameter-
settings of the net $\mathfrak{I}$ as already referred to in the last chapter

†In example 6 of chapter 2 we have already indicated that A might (to some extent) be a
neural-like network.

‡This condition could also be relaxed to mere injectivity.
3. $S^{c,d} = \{\langle E, I\rangle\}$ is a singleton consisting of the network topology of $\mathcal{I}$; since we regard the latter in the following as being fixed, i.e., as unalterable by e.g. some learning mechanism, we do not have to consider any parameter-setting for A's central dispositional system apart from $\langle E, I\rangle$

4. $S^c = S^{c,o} \times S^{c,d}$, i.e., every $s^c = \langle s^{c,o}, s^{c,d}\rangle \in S^c$ is actually completely determined by $s^{c,o}$, for our assumption on $S^{c,d}$ ($S^c$ might even be "identified" up to isomorphy with $S^{c,o}$)

5. $nc(s^{c,d})$ is determined by the unique parameter-setting $\langle E, I\rangle$ of the central dispositional system as follows: for every $s^p \in S^p$, $s^{c,o} \in S^{c,o}$:

$nc(s^{c,d})(s^p, s^{c,o}) = \langle s^{c,o}_{new}, s^{c,d}\rangle$, where $s^{c,o}_{new} = F_{g(s^p)}(s^{c,o})$, and $F_{g(s^p)}$ is defined in the last chapter on the dynamics of inhibition nets

(i.e.: the next parameter-setting of A's central system is a pair consisting of the fixed parameter-setting $s^{c,d} = \langle E, I\rangle$ of A's central dispositional system - the network connections of $\mathcal{I}$ have been assumed to be constant - and a "new" parameter-setting $s^{c,o}_{new}$ of A's central occurrent system, s.t. $s^{c,o}_{new}$ arises from the "old" parameter-setting $s^{c,o}$ of A's central occurrent system under the dynamics of $\mathcal{I}$ under the input $g(s^p)$; $g(s^p)$ is the pattern of activity that is transferred from A's perceptual system. Note that here we identify again the parameter-setting $s^p$, which is a mapping, with the set of, in this case, perceptual units that are active according to $s^p$; $g(s^p)$ is the image set of the "set" $s^p$ under $g$, i.e., the set of $g$-images of arbitrary "members" of $s^p$; in turn, $g(s^p)$, considered as a characteristic function, is a parameter-setting of A's occurrent central system. $F$ has been defined in the last section).

The set $N^p$ is specified uniquely for a system $\langle S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na\rangle$, and we may therefore speak of "the" set of nodes of the perceptual subsystem associated with a given system $\langle S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na\rangle$ that has an inhibition net $\mathcal{I}$ under $g$ as its central subsystem. Moreover, given $\mathcal{I}$, $g$, there is a unique system $Sys$, s.t. $Sys$ has $\mathcal{I}$ under $g$ as its central subsystem, except for the selection of $S^a$ and $na$, which are not relevant in the present context.

• (Assumptions on A’s Central Subsystem Having a Network Architecture) 

There is an FHIN $\mathcal{I} = \langle N, E, I, bias\rangle$, and a mapping $g$, s.t. (the system) A has $\mathcal{I}$ under $g$ as its central subsystem.

§In example 14 of chapter 3 we have already hinted at the distinction of node activities 
in a network on the one hand, and the topology of the network on the other, in terms of the 
parameter-settings of the occurrent central subsystem of a connectionist network on the one 
hand, and the parameter-settings of the dispositional central subsystem of a connectionist 
network on the other. 
The change of patterns of activity in the network $\mathcal{I}$ now corresponds to the change of the parameter-settings of the central occurrent subsystem of A under both the internal dynamics of the network and the external dynamics determined by the input from the perceptual subsystem of A. We will keep speaking sometimes of $s^p \in S^p$ and $s^{c,o} \in S^{c,o}$ as sets $\{n \in N^p \mid s^p(n) = 1\}$ and $\{n \in N \mid s^{c,o}(n) = 1\}$, rather than as the characteristic functions of those sets. Conversely, we will continue speaking of subsets of $N^p$ or $N$ in terms of their characteristic functions, i.e., as if they were parameter-settings contained in $S^p$ or $S^{c,o}$. The states of A are of course still defined as subsets of $S$; one state is still defined as the substate of another if the latter is a subset of the former (compare the terminology introduced in chapter 3, and further outlines in chapter 21). We say that a pattern $X \subseteq N^p$ is active in $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle \in S$, if $X \subseteq s^p$; accordingly, we say that a pattern $Y \subseteq N$ is active in $s$, if $Y \subseteq s^{c,o}$.

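The dynamics of a system $Sys$ which has an inhibition net $\mathcal{I}$ as its central subsystem is thus defined as such: if $s = \langle s^p_{old}, s^{c,o}_{old}, s^{c,d}_{old}, s^a_{old}, nc(s^{c,d}_{old}), na(s^{c,d}_{old})\rangle$ is the parameter-setting at time $t$, then $F_{s^p_{new}}(s) := \langle s^p_{new}, s^{c,o}_{new}, s^{c,d}_{new}, s^a_{new}, nc(s^{c,d}_{new}), na(s^{c,d}_{new})\rangle$ is the parameter-setting at time $t + 1$, where $s^{c,o}_{new} = F_{g(s^p_{old})}(s^{c,o}_{old})$, and $F_{g(s^p_{old})}$ is the dynamics of inhibition nets as defined in the last chapter (for an input $g(s^p_{old}) \subseteq N$ to the network).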

Let us now (and in the following) assume that the parameter-setting of A's perceptual system remains constant (say, identical to a fixed $s^p$) for a sufficiently large amount of time. By our assumptions on A's having an FHIN as its central component system, and by the results of the last section, it follows that there is a unique stable parameter-setting $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle \in S$ in such a case, and that $s^{c,o} = Cl(g(s^p))$, i.e., $s^{c,o}$ is the unique stable parameter-setting for $\mathcal{I}$ under the constant input $g(s^p)$ to the central system of A, where $g(s^p)$ is determined by the constant parameter-setting $s^p$ of the perceptual system of A. Moreover, by theorem 130: for all parameter-settings $s = \langle s^p, s^{c,o}_{old}, s^{c,d}_{old}, s^a_{old}, nc(s^{c,d}_{old}), na(s^{c,d}_{old})\rangle \in S$, if $F^k_{s^p}(s) := \langle s^p, s^{c,o}_{new}, s^{c,d}_{new}, s^a_{new}, nc(s^{c,d}_{new}), na(s^{c,d}_{new})\rangle$, where $k$ is the number of layers of the FHIN $\mathcal{I}$, then $s^{c,o}_{new} = Cl(g(s^p))$.
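The convergence to a closure state under constant input can be illustrated with a small sketch. The update rule below is an assumption, since the dynamics $F$ of inhibition nets is defined in the previous chapter and not restated here: a node is taken to be active at the next step if it is the bias node, belongs to the input, or is excited by an active node along a connection that no active node inhibits. The names `step` and `closure` and the toy bird/penguin net are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Set, Tuple

Node = str
Edge = Tuple[Node, Node]     # excitatory connection (source, target)
Inhib = Tuple[Node, Edge]    # inhibitory connection (inhibitor, inhibited edge)

BIAS = "bias"


def step(state: Iterable[Node], inp: Iterable[Node],
         E: Set[Edge], I: Set[Inhib]) -> FrozenSet[Node]:
    """One assumed state transition of the net under the constant input `inp`."""
    active = frozenset(state)
    fired = {
        target
        for (source, target) in E
        if source in active
        and not any((k, (source, target)) in I for k in active)
    }
    # The bias node and the input nodes are active in every successor state.
    return frozenset(inp) | {BIAS} | fired


def closure(inp: Iterable[Node], E: Set[Edge], I: Set[Inhib],
            start: Optional[Iterable[Node]] = None,
            max_steps: int = 1000) -> FrozenSet[Node]:
    """Iterate `step` until the state no longer changes (illustrates Cl)."""
    state = frozenset(start) if start is not None else frozenset(inp) | {BIAS}
    for _ in range(max_steps):
        nxt = step(state, inp, E, I)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no stable state reached within max_steps")


# Hypothetical two-layer net: 'bird' excites 'flies', 'penguin' inhibits that link.
E = {("bird", "flies")}
I = {("penguin", ("bird", "flies"))}

stable = closure({"bird"}, E, I)
stable_from_full = closure({"bird"}, E, I,
                           start={BIAS, "bird", "penguin", "flies"})
```

In this toy net the stable state reached under the input `{"bird"}` is the same whether the iteration starts from the input itself or from the fully active net, which is the kind of independence from the initial state that the text attributes to FHINs.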

15.2 Singular Occurrent Beliefs in Net Agents 

Next we are going to ascribe singular occurrent beliefs to our inhibition net 
agent A by interpreting A’s states in an adequate way. The singular occurrent 
beliefs that we ascribe to inhibition nets are identified with causally active 
patterns of excitation - this corresponds to our explication of the notion of 
occurrent belief in section 3.3 and has already been hinted at by example 11. 
We may think of these beliefs as either being caused directly by a short-term 
perception of the current state of the environment, or as being caused by a
perceptual belief of the latter form either directly or via an intermediate in- 
ference process. In order to ascribe beliefs to inhibition nets we use again our 
factual language $\mathcal{L}$ (and $\varphi$ denotes as usual an arbitrary formula of $\mathcal{L}$). When
we have defined occurrent belief and total occurrent belief ascription, we will 
show that these definitions are adequate with respect to the characterizations 
of occurrent beliefs and total occurrent beliefs that we have given in part I. 

Parameter-settings and singular occurrent belief states are associated 
in the following way: 

Postulate for Belief States: 

A believes that [$\varphi$ is true] in the parameter-setting $s$ iff
a particular set of nodes that is associated with $\varphi$ is active in $s$.

Let $\mathfrak{I}^p(\varphi) \subseteq N^p$ be the pattern of activity associated with $\varphi$, as far as A's perceptual system is concerned; let $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^p(\varphi)) \subseteq N$ be the pattern of activity associated with $\varphi$, as far as A's central system is concerned. If $\mathcal{I} = \langle N, E, I, bias\rangle$ is the inhibition net considered, then $\mathfrak{I}^{c,o}(\varphi)$ is thus a subset of the nodes of $\mathcal{I}$, and we may always assume that $bias \in \mathfrak{I}^{c,o}(\varphi)$ since the bias node fires anyway. We say that $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are interpretation mappings, $\mathfrak{I}^p(\varphi)$ and $\mathfrak{I}^{c,o}(\varphi)$ are the "neural" pattern interpretations of $\varphi$, and, conversely, (the proposition expressed by) $\varphi$ is the linguistic interpretation of $\mathfrak{I}^p(\varphi)$ and $\mathfrak{I}^{c,o}(\varphi)$. Whenever the pattern $\mathfrak{I}^p(\varphi)$ is active in a parameter-setting, the net has a perceptual belief the content of which is expressed by $\varphi$. Accordingly, whenever the pattern $\mathfrak{I}^{c,o}(\varphi)$ is active in a parameter-setting, the net has a central state belief the content of which is expressed by $\varphi$. In the first case, only the perceptual component $s^p$ of the "global" parameter-setting $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle$ is relevant, whereas in the second case only the central component $s^c$, or, more particularly, only the occurrent central component $s^{c,o}$ of $s$ is relevant. Since it is possible that $\mathfrak{I}^p(\varphi) = \mathfrak{I}^p(\psi)$ or $\mathfrak{I}^{c,o}(\varphi) = \mathfrak{I}^{c,o}(\psi)$ although $\varphi \neq \psi$, one and the same pattern may be the "neural" interpretation of different sentences. Moreover, one and the same unit may be a component of many patterns which are in turn associated with many different formulas of $\mathcal{L}$. This is a form of distributed representation, i.e. the kind of representation intended to be characteristic of connectionist approaches to cognition.
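The relation between the two interpretation mappings can be sketched as follows. This is only an illustration under stated assumptions: the node names, the bijection `g`, the formulas and their perceptual patterns are all hypothetical, and formulas are represented as plain strings.

```python
from typing import Dict, FrozenSet, Iterable

BIAS = "bias"

# Hypothetical bijection g from perceptual nodes to the central nodes other than bias.
g: Dict[str, str] = {"p1": "n1", "p2": "n2", "p3": "n3"}

# Hypothetical perceptual interpretation: formula -> set of perceptual nodes.
interp_p: Dict[str, FrozenSet[str]] = {
    "Bird(a)": frozenset({"p1", "p2"}),
    "Penguin(a)": frozenset({"p1", "p2", "p3"}),
    "Feathered(a)": frozenset({"p1", "p2"}),  # same pattern as Bird(a)
}


def image(pattern: Iterable[str]) -> FrozenSet[str]:
    """Image set of a perceptual pattern under g."""
    return frozenset(g[n] for n in pattern)


def interp_co(formula: str) -> FrozenSet[str]:
    """Central pattern: {bias} together with the g-image of the perceptual pattern."""
    return frozenset({BIAS}) | image(interp_p[formula])


def is_active(pattern: Iterable[str], setting: Iterable[str]) -> bool:
    """A pattern is active in a parameter-setting iff all of its nodes are active."""
    return frozenset(pattern) <= frozenset(setting)
```

Two different sentences (`"Bird(a)"` and `"Feathered(a)"`) share one pattern here, and the node `n1` belongs to several patterns at once, which is the distributed character of the representation described above.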

The system architecture of A that we have described in the last section, 
and the conception of perceptual and of central state beliefs that we have 
presented now, correspond to our description of perceptual beliefs and of central 
state beliefs in section 3.1 in part I. 

Now let us determine this association of parameter-settings of inter- 
preted networks and belief states more precisely by means of some definitions: 
Definition 139 (Interpreted Inhibition Network Agents)

An (interpreted inhibition) network agent $\mathfrak{N}$ is a quintuple $\langle Sys, \mathcal{I}, g, \mathfrak{I}^p, \mathfrak{I}^{c,o}\rangle$, where

1. $Sys = \langle S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na\rangle$ is a system

2. $\mathcal{I} = \langle N, E, I, bias\rangle$ is an FHIN, s.t. $Cl(\{bias\}) \neq N$

3. $Sys$ has $\mathcal{I}$ under $g$ as its central subsystem,

4. $\mathfrak{I}^p : \mathcal{L} \to \wp(N^p)$, where $N^p$ is the set of nodes of the perceptual subsystem of $Sys$

5. $\mathfrak{I}^{c,o} : \mathcal{L} \to \wp(N)$, s.t. $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^p(\varphi))$.

$\mathcal{I}$ together with $\mathfrak{I}^{c,o}$ is an interpreted inhibition network; a system $Sys$, which has an inhibition net $\mathcal{I}$ under $g$ as its central subsystem, together with $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ is an interpreted inhibition network agent, or more shortly, a network agent. $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are the interpretation mappings associated with $\mathfrak{N}$.

We have added the constraint that $Cl(\{bias\}) \neq N$ since otherwise for every parameter-setting $s$ lemma 132 would entail that $Cl(s) = N$, since $\{bias\} \subseteq s \subseteq N = Cl(\{bias\})$. In this case the cognitive activity generated by an interpreted network would trivially converge to a stable state identical to $N$, which we want to exclude.^

A partially (interpreted inhibition) network agent is defined analogously to def.139 with the minor difference that there are distinguished subsets $N'^p$ of $N^p$ and $N'$ of $N$, s.t. $N' = \{bias\} \cup g(N'^p)$, $Cl(\{bias\}) \neq N'$, and $\mathfrak{I}^p : \mathcal{L} \to \wp(N'^p)$ ($\mathfrak{I}^{c,o}$ is consequently a mapping from $\mathcal{L}$ to $\wp(N')$). In such a case the nodes contained in $N^p \setminus N'^p$ and in particular in $N \setminus N'$ may later be used as auxiliary "inter-neurons" without any representational function.

So we can strengthen the assumption of the last section concerning the 
central subsystem of A: 

• (Assumptions on A’s Being an Interpreted Inhibition Net Agent) 

There is an interpreted inhibition network agent $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^p, \mathfrak{I}^{c,o}\rangle$, s.t. the system A is identical to $\mathfrak{N}$.

Let us therefore refer to the network agent A in the subsequent sections and chapters of part IV by '$\mathfrak{N}$'.

In parts I-III the belief operators have only been indexed invisibly by 
some reference to the agent A; now we will make this reference explicit when 
we ascribe beliefs to interpreted network agents: 

^As we will see, for the class of net agents that we are going to define later, N is identical 
to the neural interpretation of the logical falsum, i.e. a net agent would always finally believe 
a contradiction if we allowed for Cl {{bias}) = N. 
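Definition 140 (Singular Occurrent Beliefs as Patterns of Activation in Network Agents)

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^p, \mathfrak{I}^{c,o}\rangle$ be a network agent.

Let $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle$ be a parameter-setting in the parameter-setting space $S$ of $Sys$:

1. $s \models_{\mathfrak{N}} B^p(\varphi)$ iff $\mathfrak{I}^p(\varphi) \subseteq s^p$

(in words: $\mathfrak{N}$ believes perceptually in $s$ that [$\varphi$ is true] iff the perceptual belief pattern associated with $\varphi$ is active in $s^p$)

2. $s \models_{\mathfrak{N}} B^{c,o}(\varphi)$ iff $\mathfrak{I}^{c,o}(\varphi) \subseteq s^{c,o}$

(in words: $\mathfrak{N}$ believes occurrently centrally in $s$ that [$\varphi$ is true] iff the central belief pattern associated with $\varphi$ is active in $s^{c,o}$).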

The ascription of occurrent (perceptual or central state) beliefs by 
means of patterns of activation matches our intuitive account of occurrent 
beliefs in part II, given that the usual additional assumptions are satisfied like 
the proper manifestation of belief states by means of the action system etc., 
which we will take for granted. 

Apart from belief simpliciter we can also introduce the concept of a 
total occurrent belief for interpreted networks: 

Definition 141 (Total Singular Occurrent Beliefs as Patterns of Activation in Network Agents)

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^p, \mathfrak{I}^{c,o}\rangle$ be a network agent.

Let $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle$ be a parameter-setting in the parameter-setting space $S$ of $Sys$:

1. $s \models_{\mathfrak{N}} AB^p(\varphi)$ iff $\mathfrak{I}^p(\varphi) = s^p$

(in words: all that $\mathfrak{N}$ believes perceptually in $s$ is that [$\varphi$ is true] iff the perceptual belief pattern associated with $\varphi$ is identical to the set $s^p$ of nodes that are active in the perceptual subsystem)

2. $s \models_{\mathfrak{N}} AB^{c,o}(\varphi)$ iff $\mathfrak{I}^{c,o}(\varphi) = s^{c,o}$

(in words: all that $\mathfrak{N}$ believes occurrently centrally in $s$ is that [$\varphi$ is true] iff the central belief pattern associated with $\varphi$ is identical to the set $s^{c,o}$ of nodes that are active in the central subsystem).

Thus, by def.140 and def.141: 

Lemma 142 

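Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^p, \mathfrak{I}^{c,o}\rangle$ be a network agent, let $s$ be a parameter-setting in the parameter-setting space $S$ of $Sys$:

It follows for all $\varphi \in \mathcal{L}$ that

• if $s \models_{\mathfrak{N}} AB^p(\varphi)$ then $s \models_{\mathfrak{N}} B^p(\varphi)$

• if $s \models_{\mathfrak{N}} AB^{c,o}(\varphi)$ then $s \models_{\mathfrak{N}} B^{c,o}(\varphi)$.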
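The two ascription clauses and the lemma can be checked on a toy case. The sketch below is illustrative only: the interpretation `interp_p`, the formula names and the node names are hypothetical, and parameter-settings are identified with their sets of active nodes, as the text does.

```python
from typing import Dict, FrozenSet, Iterable

Pattern = FrozenSet[str]

# Hypothetical perceptual interpretation mapping (formula -> pattern).
interp_p: Dict[str, Pattern] = {
    "phi": frozenset({"p1", "p2"}),
    "psi": frozenset({"p1"}),
}


def believes(interp: Dict[str, Pattern], formula: str,
             active: Iterable[str]) -> bool:
    """Belief in the style of def.140: the pattern is contained in the active nodes."""
    return interp[formula] <= frozenset(active)


def totally_believes(interp: Dict[str, Pattern], formula: str,
                     active: Iterable[str]) -> bool:
    """Total belief in the style of def.141: the pattern equals the active nodes."""
    return interp[formula] == frozenset(active)


# A perceptual parameter-setting, given as its set of active nodes.
s_p = frozenset({"p1", "p2"})

# Lemma 142 on this example: every total belief is also a belief.
lemma_holds = all(
    believes(interp_p, f, s_p)
    for f in interp_p
    if totally_believes(interp_p, f, s_p)
)
```

Here `phi` is both believed and totally believed in `s_p`, whereas `psi` is believed without being a total belief, since its pattern is a proper subset of the active nodes.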

Def.141 is furthermore justified by the following theorem: 

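Theorem 143

If $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are onto, then def.141 is an equivalent reformulation of our def.10 of total belief ascription in section 3.2, since it holds:

1. $s \models_{\mathfrak{N}} AB^p(\varphi)$ iff

(a) $s \models_{\mathfrak{N}} B^p\varphi$

(b) for every $\psi \in \mathcal{L}$: if $s \models_{\mathfrak{N}} B^p\psi$, then $\mathfrak{N}$'s perceptual belief that [$\psi$ is true] is a substate of $\mathfrak{N}$'s perceptual belief that [$\varphi$ is true], i.e.,

$\{s' \in S \mid s' \models_{\mathfrak{N}} B^p\varphi\} \subseteq \{s' \in S \mid s' \models_{\mathfrak{N}} B^p\psi\}$

2. $s \models_{\mathfrak{N}} AB^{c,o}(\varphi)$ iff

(a) $s \models_{\mathfrak{N}} B^{c,o}\varphi$

(b) for every $\psi \in \mathcal{L}$: if $s \models_{\mathfrak{N}} B^{c,o}\psi$, then $\mathfrak{N}$'s central occurrent belief that [$\psi$ is true] is a substate of $\mathfrak{N}$'s central occurrent belief that [$\varphi$ is true], i.e.,

$\{s' \in S \mid s' \models_{\mathfrak{N}} B^{c,o}\varphi\} \subseteq \{s' \in S \mid s' \models_{\mathfrak{N}} B^{c,o}\psi\}$.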

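Proof:

We will only show 1; the other claim is proved analogously.

• from the left to the right: assume that $s \models_{\mathfrak{N}} AB^p(\varphi)$; by def.141, $\mathfrak{I}^p(\varphi) = s^p$. Therefore also $\mathfrak{I}^p(\varphi) \subseteq s^p$, and so by def.141 again, $s \models_{\mathfrak{N}} B^p\varphi$.

Now let $\psi \in \mathcal{L}$, s.t. $s \models_{\mathfrak{N}} B^p\psi$: def.141 implies that $\mathfrak{I}^p(\psi) \subseteq s^p = \mathfrak{I}^p(\varphi)$. If there was an $s' \in S$, s.t. $s' \models_{\mathfrak{N}} B^p\varphi$ but $s' \not\models_{\mathfrak{N}} B^p\psi$, then by def.141 $\mathfrak{I}^p(\varphi) \subseteq s'^p$, $\mathfrak{I}^p(\psi) \not\subseteq s'^p$, contradicting $\mathfrak{I}^p(\psi) \subseteq \mathfrak{I}^p(\varphi)$.

• from the right to the left: assume that $s \models_{\mathfrak{N}} B^p\varphi$, and for every $\psi \in \mathcal{L}$: if $s \models_{\mathfrak{N}} B^p\psi$, then $\{s' \in S \mid s' \models_{\mathfrak{N}} B^p\varphi\} \subseteq \{s' \in S \mid s' \models_{\mathfrak{N}} B^p\psi\}$.

Since $s \models_{\mathfrak{N}} B^p\varphi$, it follows from def.141 that $\mathfrak{I}^p(\varphi) \subseteq s^p$.

Now suppose for contradiction that $\mathfrak{I}^p(\varphi) \subsetneq s^p$; because $\mathfrak{I}^p$ is onto (by assumption), there is a $\psi \in \mathcal{L}$, s.t. $\mathfrak{I}^p(\psi) = s^p$, and therefore $s \models_{\mathfrak{N}} B^p\psi$. By the right-to-left assumption, it follows that $\{s' \in S \mid s' \models_{\mathfrak{N}} B^p\varphi\} \subseteq \{s' \in S \mid s' \models_{\mathfrak{N}} B^p\psi\}$, thus there is an $s'$, s.t. $s'^p = \mathfrak{I}^p(\varphi)$ and $s' \models_{\mathfrak{N}} B^p\psi$, and so $\mathfrak{I}^p(\psi) \subseteq \mathfrak{I}^p(\varphi)$ by def.141, contradicting $\mathfrak{I}^p(\varphi) \subsetneq s^p = \mathfrak{I}^p(\psi)$. ■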
Theorem 143 shows that, if $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are onto, our ascription of total occurrent belief states in terms of patterns of activation is adequate for the ascription of beliefs that we have suggested above, since our explication of the notion of total belief in def.10 of part II is satisfied (as well as the more detailed explication in the appendix, section 21.4). If $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are not onto, then at least the direction from the left to the right (see the proof above) is still satisfied, and def.141 might still be considered as a kind of approximation to def.10. Each of the subsequent results is still going to hold if $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are not onto. But let us suppose in the following that $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ are indeed onto for the sake of theorem 143. The net agents defined within the proof of our completeness theorem below also have this property.

A further assumption which is - though implicitly - contained in the definitions above is that for every network agent $\mathfrak{N}$, and for every $\varphi \in \mathcal{L}$ there are patterns $\mathfrak{I}^p(\varphi)$ and $\mathfrak{I}^{c,o}(\varphi)$, s.t. $\mathfrak{N}$ perceptually believes that [$\varphi$ is true] if the first pattern is active, and $\mathfrak{N}$ occurrently centrally believes that [$\varphi$ is true] if the second pattern is active (and similarly for the corresponding total beliefs). Thus, for every $\varphi \in \mathcal{L}$ there are network states in which the proposition expressed by $\varphi$ is believed by the agent, despite the fact that $\mathcal{L}$ is infinite. We think that this is acceptable since $\mathcal{L}$ is at least logically finite, and therefore there are also only finitely many propositions (sets of worlds) that are expressed by sentences of $\mathcal{L}$. However, if someone still liked to restrict $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ to a proper subset of $\mathcal{L}$, this would not cause any problems for our account, and the results of the subsequent chapters would still be valid. We would simply have to regard $\mathfrak{I}^p$ and $\mathfrak{I}^{c,o}$ as partial mappings, and every relevant claim in the subsequent section would have to be supplemented by the qualification 'if $\mathfrak{I}^p$/$\mathfrak{I}^{c,o}$ is defined'. It is just in order to keep our terminology as simple as possible that we stick to interpretation functions that are defined on $\mathcal{L}$ universally.

15.3 General Defeasible Dispos. Beliefs and Nonmon. Inferences in Net Agents

Let us turn to general belief ascription and to the dynamic aspects of cognition in network agents. At some time $t$ an interpreted network agent $\mathfrak{N}$ may exhibit certain inferential activities. We focus on those types of inferences (i) that are initiated by an input to the central system from the perceptual system, s.t. this input corresponds to a total perceptual (and thus occurrent) belief of $\mathfrak{N}$, and (ii) that have a final equilibrium state, in which $\mathfrak{N}$ occurrently centrally believes the conclusion of the inference. The final state of an inference is therefore regarded as identical to the closure of its input (compare section 14.3), i.e. the stable state into which the current net state is transformed under the given input according to the dynamics of the network. We may think of the closure state as a plausible hypothesis generated by the agent in light of the evidence given by the input. One of the peculiarities of this approach to inferences by network agents is that - contrary to the typical symbolic implementations of inferences - every inference turns out to be direct (compare chapter 4), i.e., it does not make sense to speak of an indirect inference as being composed of several direct subinferences, because every inference is regarded as terminating with the stable closure parameter-setting that is associated with the input. For that reason we can also identify the constant temporal period $\Delta t_c$ that we have referred to in part I and part II as being the duration of a direct causation, of direct sustaining, and of a direct inference step, with the maximum time that it takes an inhibition net to settle into a stable state given an arbitrary input. According to theorem 130, and according to the remark we have made subsequently to this theorem, $\Delta t_c$ may be set to the number of layers of the inhibition net of $\mathfrak{N}$, or, more properly, the number of layers times the temporal unit period.

As we have pointed out in chapter 4, inferences are based on general 
beliefs, and the presence of general beliefs entails the presence of inferential dis- 
positional states. We will only deal with general defeasible beliefs in this section, 
i.e., beliefs the contents of which are expressed by defeasible conditionals. In 
the next section we turn to strict general beliefs. 

In the following we will identify general defeasible beliefs with nonmonotonic inference dispositions. We have argued in chapter 3 that general beliefs are not necessarily identical to inferential dispositions, but that they only necessarily entail such dispositions. But since we will only ascribe one type of defeasible general beliefs to our net agents, corresponding to just one defeasible implication sign $\Rightarrow$, an identification is unproblematic. Thus we ascribe the general defeasible belief that [$\alpha[x] \Rightarrow \beta[x]$ is true] to a network agent $\mathfrak{N}$ if and only if the latter is disposed to draw a nonmonotonic inference from $\alpha[a]$ to $\beta[a]$ of the form that we have just described, i.e., if and only if: if $s^p$ is the current parameter-setting of the perceptual system, s.t. all that $\mathfrak{N}$ believes perceptually in $s^p$ is that [$\alpha[a]$ is true], then the activation pattern corresponding to $s^p$ is transmitted to the central subsystem as a constant input $s^* = g(s^p)$, and the inhibition network state converges under this constant input to the stable parameter-setting $Cl(s^*)$ in which $\mathfrak{N}$ occurrently centrally believes that [$\beta[a]$ is true]. Put shortly we might say: $\mathfrak{N}$ dispositionally (centrally) defeasibly believes in $s$ that [$\alpha[x] \Rightarrow \beta[x]$ is true] iff $\mathfrak{N}$ is disposed in $s$ to draw the nonmonotonic inference from the total perceptual belief of $\alpha[a]$ to the final central belief of $\beta[a]$.

Since the dispositional central subsystem (i.e., the network topology) of $\mathfrak{N}$ is held constant, since the central system of $\mathfrak{N}$ has been assumed to be a finite hierarchical inhibition net, and since we have shown in the last chapter



II This account of inference in network agents closely parallels our example 31 in chapter 4, 
where we have considered a possible implementation of inferences in connectionist networks. 
that the parameter-settings of FHINs always converge under a constant input $s^*$ to a parameter-setting $Cl(s^*)$ which only depends on the given input $s^*$ and not on the selection of an initial state, it will follow: there is a parameter-setting $s$, s.t. $\mathfrak{N}$ believes in $s$ that [$\alpha[x] \Rightarrow \beta[x]$ is true] iff for all parameter-settings $s$ it is the case that $\mathfrak{N}$ believes in $s$ that [$\alpha[x] \Rightarrow \beta[x]$ is true]. $\mathfrak{N}$'s general beliefs are thus determined completely by the pattern of connectivity within the inhibition net of $\mathfrak{N}$. In the next chapter we will show that the so-defined general beliefs of $\mathfrak{N}$ are always closed under the rules of the system CL of nonmonotonic logic for normic conditionals which we have discussed in part III. Although we leave open what kind of defeasible conditional $\Rightarrow$ actually is, this result concerning our "psycho-"semantics for network agents and the system CL indicates that $\Rightarrow$ might be identified with the normic implication sign $\Rightarrow_{nor}$.

Let us state this now in more formal terms. We use again as the 
language of belief ascription: 

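Definition 144 (General Defeasible Dispositional Central Beliefs as Nonmonotonic Inference Dispositions Implemented by Patterns of Connectivity in Network Agents)

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^p, \mathfrak{I}^{c,o}\rangle$ be a network agent.

Let $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle$ be a parameter-setting (in the parameter-setting space $S$ of $Sys$):

We say that

$s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$

iff

for all parameter-settings $s_0 = \langle s^p_0, s^{c,o}_0, s^{c,d}_0, s^a_0, nc(s^{c,d}_0), na(s^{c,d}_0)\rangle$, for all parameter-settings $s_1 = \langle s^p_1, s^{c,o}_1, s^{c,d}_1, s^a_1, nc(s^{c,d}_1), na(s^{c,d}_1)\rangle$ with $s^p_1 = s^p_0$ and $s^{c,o}_1 = Cl(g(s^p_0))$:

if $s_0 \models_{\mathfrak{N}} AB^p(\alpha[a])$, then $s_1 \models_{\mathfrak{N}} B^{c,o}(\beta[a])$.**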

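By def.140 and 141 from the above, the last definition has some equivalent but more transparent (re-)formulations:

Remark 145

• $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$ iff

• $\mathfrak{I}^{c,o}(\beta[a]) \subseteq Cl(g(\mathfrak{I}^p(\alpha[a])))$ iff

• $\mathfrak{I}^{c,o}(\beta[a]) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha[a]))$.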
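The last clause of the remark lends itself to a direct check. The following sketch is not the author's implementation: it assumes a simple update rule for inhibition nets (a node fires if an active node excites it along a connection that no active node inhibits; bias and input nodes stay active), because the official dynamics is defined in the previous chapter. The net, the formulas and their central patterns are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

BIAS = "bias"
Edge = Tuple[str, str]
Inhib = Tuple[str, Edge]


def closure(inp: Iterable[str], E: Set[Edge], I: Set[Inhib],
            max_steps: int = 1000) -> FrozenSet[str]:
    """Stable state under constant input, using the assumed update rule."""
    state = frozenset(inp) | {BIAS}
    for _ in range(max_steps):
        fired = {t for (s, t) in E
                 if s in state and not any((k, (s, t)) in I for k in state)}
        nxt = frozenset(inp) | {BIAS} | fired
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no stable state reached within max_steps")


# Hypothetical net: 'bird' excites 'flies'; 'penguin' inhibits that connection.
E: Set[Edge] = {("bird", "flies")}
I: Set[Inhib] = {("penguin", ("bird", "flies"))}

# Hypothetical central interpretation (every pattern contains the bias node).
interp_co: Dict[str, FrozenSet[str]] = {
    "Bird(a)": frozenset({BIAS, "bird"}),
    "Bird(a) & Penguin(a)": frozenset({BIAS, "bird", "penguin"}),
    "Flies(a)": frozenset({BIAS, "flies"}),
}


def defeasibly_believes(alpha: str, beta: str) -> bool:
    """Third clause of remark 145: pattern(beta) is contained in Cl(pattern(alpha))."""
    return interp_co[beta] <= closure(interp_co[alpha], E, I)
```

In this toy net the agent defeasibly believes that birds fly, but the stronger antecedent `"Bird(a) & Penguin(a)"` no longer supports `"Flies(a)"`, which illustrates why the inference relation is nonmonotonic.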



**Note that only the first two entries of the parameter-settings $s_0$ and $s_1$ are actually relevant in this context. Furthermore, $s^{c,d}_0 = s^{c,d}_1 = \langle E, I\rangle$ by our assumptions from above.
General beliefs are thus not represented in the network by patterns of activity but by the topology of the network. Such a way of coding is again a distributed kind of representation.

A clause similar to the last one of remark 145 is used by Gärdenfors [57], p.63, in order to introduce nonmonotonic inferences to neural networks. The only minor difference is that Gärdenfors does not interpret object languages by
patterns, but instead he talks about the patterns in the metalanguage without 
making use of an object language at all. The last clause of remark 145 is going 
to be the one that we use in section 16.2, and also in the chapters 24-27 of the 
appendix, when we show our representation theorems. 

As already indicated, lemma 128 of the last chapter directly entails: 

Corollary 146 

The following two claims are equivalent: 

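1. there is an $s \in S$, s.t. $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$

2. for all $s \in S$: $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$.

By the last corollary, we can simply say that $\mathfrak{N} \models B^c(\alpha[x] \Rightarrow \beta[x])$, or, even more shortly, that $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$ iff there is an $s \in S$ (for all $s \in S$): $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$.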

We can now show that the characteristic property of dispositional be- 
liefs which we have expressed on p.68 of chapter 4 (and which we have informally 
justified in chapter 3) is actually satisfied: 

Corollary 147 (Dispositional Beliefs and Dispositions to Change to and Remain in Belief States)

If $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$ then $s \models_{\mathfrak{N}} \alpha[a] \to_{dis} \beta[a]$

(where we understand $\to_{dis}$ according to def.19, with the additional qualification that $\alpha[a]$ expresses the content of a total perceptual belief, whereas $\beta[a]$ expresses the content of an occurrent central state belief)

Proof:

• Recall from def.19 on p.65 that $s \models_{\mathfrak{N}} \alpha[a] \to_{dis} \beta[a]$ iff

1. $\mathfrak{N}$ is disposed in $s$ to change (after the amount $\Delta t_c$ of time) to the occurrent central belief that [$\beta[a]$ is true] given the total perceptual belief that [$\alpha[a]$ is true] (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time)

In example 12 of section 3.3 we have already referred to the topology of the connections within a network, and to the assignment of weights to the connections, as the typical implementations of dispositional beliefs in the connectionist paradigm.
2. $\mathfrak{N}$ is disposed in $s$ to remain (for the amount $\Delta t_c$ of time) in the occurrent central belief that [$\beta[a]$ is true] given the total perceptual belief that [$\alpha[a]$ is true] (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time).

• By def.187 on p.308 in the appendix we have (writing $F^{k}_{s^p}$ for the $k$-fold application of $F_{s^p}$):

1. let $s^{c,d} \in S^{c,d}$, let $X, Y$ be states, i.e., $X, Y \subseteq S$:

$\mathfrak{N}$ is disposed in $s^{c,d}$ to change (after the amount $\Delta t_c$ of time) to $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time) iff

for all $s \in X$, s.t. $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle$: $F^{\Delta t_c}_{s^p}(s) \in Y$

2. let $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle \in S$, let $X, Y$ be states, i.e., $X, Y \subseteq S$:

$\mathfrak{N}$ is disposed in $s$ to change (after the amount $\Delta t_c$ of time) to $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time) iff

$\mathfrak{N}$ is disposed in $s^{c,d}$ to change (after the amount $\Delta t_c$ of time) to $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time)

3. let $s^{c,d} \in S^{c,d}$, let $X, Y$ be states, i.e., $X, Y \subseteq S$:

$\mathfrak{N}$ is disposed in $s^{c,d}$ to remain (for the amount $\Delta t_c$ of time) in $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time) iff

for all $s \in X \cap Y$, s.t. $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle$: there is an amount $\Delta t$ of time, s.t., after $\Delta t$ state-transitions, the system remains for (at least) $\Delta t_c$ state-transitions within $Y$, i.e.

$F^{\Delta t}_{s^p}(s) \in Y$, $F^{\Delta t + 1}_{s^p}(s) \in Y$, $\ldots$, $F^{\Delta t + \Delta t_c}_{s^p}(s) \in Y$

4. let $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle \in S$, let $X, Y$ be states, i.e., $X, Y \subseteq S$:

$\mathfrak{N}$ is disposed in $s$ to remain (for the amount $\Delta t_c$ of time) in $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time) iff

$\mathfrak{N}$ is disposed in $s^{c,d}$ to remain (for the amount $\Delta t_c$ of time) in $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time).

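• Now suppose that $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$, for $s = \langle s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})\rangle \in S$:

let $X$ be $\mathfrak{N}$'s total perceptual belief that [$\alpha[a]$ is true], i.e., $X = \{s' \in S \mid s' \models_{\mathfrak{N}} AB^p(\alpha[a])\}$;

let $Y$ be $\mathfrak{N}$'s occurrent central state belief that [$\beta[a]$ is true], i.e., $Y = \{s' \in S \mid s' \models_{\mathfrak{N}} B^{c,o}(\beta[a])\}$;

• for all $s_{old} \in X$, s.t. $s_{old} = \langle s^p_{old}, s^{c,o}_{old}, s^{c,d}_{old}, s^a_{old}, nc(s^{c,d}_{old}), na(s^{c,d}_{old})\rangle$ with $s^{c,d}_{old} = s^{c,d}$: if the $\Delta t_c$-fold application of $F_{s^p_{old}}$ to $s_{old}$ yields

$F^{\Delta t_c}_{s^p_{old}}(s_{old}) = \langle s^p_{new}, s^{c,o}_{new}, s^{c,d}_{new}, s^a_{new}, nc(s^{c,d}_{new}), na(s^{c,d}_{new})\rangle$,

then, according to def.138,

$s^{c,o}_{new} = F^{\Delta t_c}_{g(s^p_{old})}(s^{c,o}_{old})$;

because of theorem 130, $s^{c,o}_{new} = Cl(g(s^p_{old}))$, but $\langle s^p_{new}, Cl(g(s^p_{old})), s^{c,d}_{new}, s^a_{new}, nc(s^{c,d}_{new}), na(s^{c,d}_{new})\rangle \in Y$ by $s_{old} \models_{\mathfrak{N}} AB^p(\alpha[a])$, $s \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$, def.141, and remark 145.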



• Therefore, we have:

$\mathfrak{N}$ is disposed in $s^{c,d}$ to change (after the amount $\Delta t_c$ of time) to $Y$ given $X$ (given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time),

and thus also:

$\mathfrak{N}$ is disposed in $s$ to change (after the amount $\Delta t_c$ of time) to the belief that [$\beta[a]$ is true] given the total belief that [$\alpha[a]$ is true] (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time).

• Analogously, we can show:

$\mathfrak{N}$ is disposed in $s^{c,d}$ to remain (for the amount $\Delta t_c$ of time) in $Y$ given $X$ (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time),

and thus also:

$\mathfrak{N}$ is disposed in $s$ to remain (for the amount $\Delta t_c$ of time) in the belief that [$\beta[a]$ is true] given the total belief that [$\alpha[a]$ is true] (and given that the perceptual input to $\mathfrak{N}$ is constant for the amount $\Delta t_c$ of time).

• Summing up, it follows as claimed above: $s \models_{\mathfrak{N}} \alpha[a] \to_{dis} \beta[a]$. ■



Corollary 147 enables us to put the formal machinery that we have 
developed in chapter 4 to work. 

Let a trajectory (of parameter-settings) of $\mathfrak{N}$ be any sequence $(s(t))_{t \in I_n}$ for $I_n = \{0, \ldots, n\}$, s.t. $s(0) \in S$ is arbitrary, and there is a parameter-setting $s^p \in S^p$ of the perceptual subsystem, s.t. for all $m \in I_n$ with $m > 0$: $s(m) = F_{s^p}(s(m-1))$;
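A trajectory in this sense can be generated mechanically. The sketch below is a simplification made for illustration: it tracks only the central occurrent component of each parameter-setting, represents settings as sets of active nodes, and uses an assumed update rule (a node fires if an active node excites it along an uninhibited connection; bias and input stay active), since the official dynamics $F$ is defined in the previous chapter. All names and the toy net are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

BIAS = "bias"
Edge = Tuple[str, str]
Inhib = Tuple[str, Edge]


def step(state: FrozenSet[str], inp: FrozenSet[str],
         E: Set[Edge], I: Set[Inhib]) -> FrozenSet[str]:
    """One assumed state transition under the constant input `inp`."""
    fired = {t for (s, t) in E
             if s in state and not any((k, (s, t)) in I for k in state)}
    return inp | {BIAS} | fired


def trajectory(s0: Iterable[str], inp: Iterable[str],
               E: Set[Edge], I: Set[Inhib], n: int) -> List[FrozenSet[str]]:
    """Return [s(0), ..., s(n)] with s(m) = step(s(m-1)) for all 0 < m <= n."""
    states = [frozenset(s0)]
    constant_input = frozenset(inp)
    for _ in range(n):
        states.append(step(states[-1], constant_input, E, I))
    return states


# Hypothetical net and an arbitrary initial central setting s(0).
E: Set[Edge] = {("bird", "flies")}
I: Set[Inhib] = {("penguin", ("bird", "flies"))}
traj = trajectory({BIAS, "penguin", "flies"}, {"bird"}, E, I, n=3)
```

Starting from an arbitrary initial setting, the sequence settles after two transitions in this example and stays there, so the belief associated with the pattern containing `flies` is both reached and sustained under the constant input.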

E.g., corollary 147 now implies by corollary 21 of p.67 in chapter 4: 

Corollary 148 

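For all trajectories $(s(t))_{t \in I_n}$ of $\mathfrak{N}$: if $s(t) \models_{\mathfrak{N}} \alpha[a] \to_{dis} \beta[a]$ then:

• if $s(t) \models_{\mathfrak{N}} AB^p(\alpha[a])$, then

1. $t + \Delta t_c, (s(t))_{t \in I_n} \models_{\mathfrak{N}} Causes(tb^p(\alpha[a]), b^{c,o}(\beta[a]))$

2. $s(t + \Delta t_c) \models_{\mathfrak{N}} B^{c,o}(\beta[a])$

• if $s(t) \models_{\mathfrak{N}} AB^p(\alpha[a])$ and $s(t) \models_{\mathfrak{N}} B^{c,o}(\beta[a])$, then $s(t) \models_{\mathfrak{N}} Sustains(tb^p(\alpha[a]), b^{c,o}(\beta[a]))$.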

We use the notation introduced in parts I, II, and the first chapter of the appendix ($Causes$, $Sustains$, $tb^p$, $b^{c,o}$); $tb^p$ has been used to denote total belief states, $b^{c,o}$ has been used to denote belief states.
Finally, we can re-introduce def.22 as a definition of defeasible-reason-for and nonmonotonic inference ascription. As we have emphasized above, our distinction between direct and indirect reasons/inferences collapses for the class of network agents that we consider. Moreover, let us restrict ourselves to trajectories $\langle s(t - \Delta t_c), \ldots, s(t)\rangle$, where $\Delta t_c$ is again the time that it takes the inhibition net $\mathcal{I}$ of $\mathfrak{N}$ at maximum to settle into a stable parameter-setting.

So we can simply define ($DR$ is the defeasible-reason ascription operator introduced in chapter 4):

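Definition 149 (Reason-for Ascription and Nonmonotonic Inference Ascription for Network Agents)

Let $\langle s(t - \Delta t_c), \ldots, s(t)\rangle$ be a trajectory of parameter-settings for $\mathfrak{N}$ from time $t - \Delta t_c$ to time $t$:

1. $\langle s(t - \Delta t_c), \ldots, s(t)\rangle \models_{\mathfrak{N}} DR(AB^p(\alpha[a]), B^{c,o}(\beta[a]))$ iff

(a) $s(t - \Delta t_c) \models_{\mathfrak{N}} B^c(\alpha[x] \Rightarrow \beta[x])$

(b) $s(t - \Delta t_c) \models_{\mathfrak{N}} AB^p(\alpha[a])$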

2. {s{t - Ate), ■■■, s{t)) N a[a] /3[a] iff 

(a) s{t) t= DR{ABP{a[a]),B'^^°{(5[a])) 

(h) s{t-Ate)^B<^^ffl3[a\)). 

This definition of nonmonotonic inference for network agents, together 
with the definitions of singular occurrent beliefs and of general dispositional be- 
liefs for network agents above, corresponds precisely to our explication of the 
notion of nonmonotonic inference in chapter 4, and our (partial) explication 
of the notion of belief in chapter 3. Thus - given that an interpreted network 
agent’s perceptual and central subsystem is connected “adequately” to her ac- 
tion system, which we presuppose - there are good reasons to think that an 
interpreted network agent has singular occurrent perceptual beliefs, singular 
occurrent central state beliefs, general dispositional central state beliefs, and 
that she draws nonmonotonic inferences from total occurrent perceptual be- 
liefs to occurrent central state beliefs. Occurrent beliefs are implemented by 
patterns of activity, dispositional beliefs by the network structure. Nonmono- 
tonic inferences are implemented by state-transitions in an inhibition network. 

15.4 Universal Dispositional Beliefs in Net Agents 

Now we will complement our account of general defeasible beliefs in network agents by an account of general strict beliefs. Contrary to its defeasible counterpart, we are not going to identify the strict general belief that [$\alpha[x] \to \beta[x]$ is true] with an inference disposition. But we define the ascription of strict general beliefs in a way such that a monotonic inference disposition is entailed (just as demanded by our explication of the notion of general belief - see p.68 in chapter 4).

Let us restrict ourselves to $\mathcal{L}_{\rightarrow}$ as the language of strict belief
ascription, i.e., we restrict ourselves to universal belief ascription:

Definition 150 (General Strict Dispositional Central Beliefs as States of Pat- 
tern Containment in Network Agents) 

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}\rangle$ be a network agent.

Let s = na(s^’^)) be a parameter- setting in 

the parameter- setting space S of Sys: 

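We say that

$s \models_{\mathfrak{N}} B^{c}(\alpha[x] \rightarrow \beta[x])$

iff $\mathfrak{I}^{c,o}(\alpha[a] \rightarrow \beta[a]) = \{bias\}$

(iff $\mathfrak{I}^{p}(\alpha[a] \rightarrow \beta[a]) = \emptyset$).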

This implies directly: 

Corollary 151 

The following two claims are equivalent: 

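1. there is an $s \in S$, s.t. $s \models_{\mathfrak{N}} B^{c}(\alpha[x] \rightarrow \beta[x])$

2. for all $s \in S$: $s \models_{\mathfrak{N}} B^{c}(\alpha[x] \rightarrow \beta[x])$.

By the last corollary, we can again simply say that
$\mathfrak{N} \models B^{c}(\alpha[x] \rightarrow \beta[x])$, or, more briefly, that
$\mathfrak{N} \models \alpha[x] \rightarrow \beta[x]$, iff there is an $s \in S$ (for all $s \in S$):
$s \models_{\mathfrak{N}} B^{c}(\alpha[x] \rightarrow \beta[x])$.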

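We are going to study a class of interpreted inhibition net agents in the
next chapter which have (among others) the following property: if
$\mathfrak{I}^{c,o}(\alpha[a] \rightarrow \beta[a]) = \{bias\}$, then
$\mathfrak{I}^{c,o}(\alpha[a]) \supseteq \mathfrak{I}^{c,o}(\beta[a])$; note that this property implies: if
$\mathfrak{I}^{p}(\alpha[a] \rightarrow \beta[a]) = \emptyset$, then
$\mathfrak{I}^{p}(\alpha[a]) \supseteq \mathfrak{I}^{p}(\beta[a])$.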

Let us now show informally that under this assumption the universal
general belief that [$\alpha[x] \rightarrow \beta[x]$ is true] entails the disposition to draw a
monotonic (or, more particularly, a deductive) inference from the perceptual
belief that [$\alpha[a]$ is true] to the central belief that [$\beta[a]$ is true]: if the network
agent believes in $s$ universally (centrally) that [$\alpha[x] \rightarrow \beta[x]$ is true], and if the
agent additionally believes in $s$ perceptually that [$\alpha[a]$ is true], then (i) by the
latter assumption, and by def.140, the pattern $\mathfrak{I}^{p}(\alpha[a])$ is active in $s$, (ii) by
def.138 and 139, after one state transition the pattern $\mathfrak{I}^{c,o}(\alpha[a])$ is active in
$F_{s^{p}}(s)$, (iii) by the assumptions and by def.150, the pattern $\mathfrak{I}^{c,o}(\beta[a])$ is also
active in $F_{s^{p}}(s)$, therefore (iv) the agent centrally believes in $F_{s^{p}}(s)$ that [$\beta[a]$
is true], and by our assumptions on the network dynamics the agent adheres to
that belief until the stable state is reached. Summing up: if the network agent
$\mathfrak{N}$ believes in $s$ universally (centrally) that [$\alpha[x] \rightarrow \beta[x]$ is true], then - given
our net representation of universal beliefs and the special assumption sketched
before - $\mathfrak{N}$ is disposed to draw a monotonic inference from the perceptual belief
that [$\alpha[a]$ is true] to the central belief that [$\beta[a]$ is true].
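
The four steps of this informal argument can be condensed into one chain of inclusions. The following is only a sketch: $s^{c}$ is a shorthand, introduced here as an assumption and not fixed in the text above, for the central activity pattern of the state reached after one state transition, and "the pattern is active" is read as "the pattern is contained in $s^{c}$".

```latex
% Assumption of this sketch: s^{c} denotes the central activity pattern after one transition.
\mathfrak{I}^{c,o}(\alpha[a] \rightarrow \beta[a]) = \{bias\}
\;\Longrightarrow\;
\mathfrak{I}^{c,o}(\beta[a]) \;\subseteq\; \mathfrak{I}^{c,o}(\alpha[a]) \;\subseteq\; s^{c}
```

Read from right to left: the perceptual belief puts the pattern for $\alpha[a]$ into the central state, the special property makes the pattern for $\beta[a]$ a subpattern of it, and pattern containment is what the ascription of the central belief in $\beta[a]$ requires.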

This informal reasoning could be turned into a formal proof by using 
again the relevant definitions and results from part I and from the appendix. We 
could furthermore add analogous results as in the last section on general defea- 
sible beliefs, and we could finally restate the definition of monotonic inference 
for the present context. But since we want to concentrate just on nonmonotonic 
inferences for the rest of part IV, we leave it with these informal considerations 
which nevertheless indicate the adequacy of our definition of universal beliefs 
for network agents under the special assumptions from above. 

For the next chapter, it is useful to keep track of the following obvious 

remark: 

Remark 152 

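Let $\mathfrak{N}$ be a network agent.

Assume that for all $\alpha[a] \rightarrow \beta[a] \in \mathcal{L}_{\rightarrow}$:

if $\mathfrak{I}^{c,o}(\alpha[a] \rightarrow \beta[a]) = \{bias\}$, then $\mathfrak{I}^{c,o}(\alpha[a]) \supseteq \mathfrak{I}^{c,o}(\beta[a])$.

Then $\mathfrak{N} \models \alpha[x] \rightarrow \beta[x]$ implies for all $s \in S$:

1. if $s \models_{\mathfrak{N}} B^{p}(\alpha[a])$, then $s \models_{\mathfrak{N}} B^{p}(\beta[a])$

2. if $s \models_{\mathfrak{N}} B^{c,o}(\alpha[a])$, then $s \models_{\mathfrak{N}} B^{c,o}(\beta[a])$.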

In the next chapter we will see that, if $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}\rangle$, s.t.
$\mathfrak{I}^{p}$ and $\mathfrak{I}^{c,o}$ have certain representational properties, then (i) the set of
conditionals expressing the contents of the general beliefs of $\mathfrak{N}$ is closed under
the rules of the system CL in part III, and (ii) for every consistent CL-theory
$\mathcal{TH}_{\Rightarrow}$ of conditionals $\alpha[x] \Rightarrow \beta[x]$ extending a deductively closed set
$\mathcal{TH}_{\rightarrow}$ of conditionals $\varphi[x] \rightarrow \psi[x]$ (recall the definitions of chapter 10 in
part III) there is an interpreted network agent (ii.i) having a set of general
defeasible beliefs the contents of which are expressed by precisely the defeasible
conditionals contained in $\mathcal{TH}_{\Rightarrow}$, and (ii.ii) having a set of universal beliefs the
contents of which are expressed by precisely the universal conditionals contained
in $\mathcal{TH}_{\rightarrow}$, where (iii) $\mathfrak{I}^{p}$ and $\mathfrak{I}^{c,o}$ have again those special representational
properties. Result (i) will be called a ‘soundness’ result, result (ii) a
‘completeness’ result, and both together a ‘representation theorem’.




Chapter 16 

CUMULATIVE-ORDERED INTERPRETED INH. NET
AGENTS AND THE SYSTEM CL



16.1 The Network Semantics for Cum.-Ordered Interpreted Inh. 
Net Agents 

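We are now going to study a special subclass of interpreted inhibition network
agents, which are characterized by the following constraints or postulates
concerning $\mathfrak{I}^{c,o}$ (implying corresponding postulates on $\mathfrak{I}^{p}$ via $g$):

Postulates for $\mathfrak{I}^{c,o}$:

1. $\mathfrak{I}^{c,o}(\top) = \{bias\}$ ($\top$ is the logical verum),

$\mathfrak{I}^{c,o}(\bot) = N$ ($\bot$ is the logical falsum)

2. let $\mathcal{TH}_{\mathfrak{I}^{c,o}} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{c,o}(\varphi) = \{bias\}\}$:

for all $\varphi, \psi \in \mathcal{L}$: if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \rightarrow \psi$ then $\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$

3. for all $\varphi, \psi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi \wedge \psi) = \mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\psi)$

4. for all $\varphi \in \mathcal{L}$: $bias \in \mathfrak{I}^{c,o}(\varphi)$.

Note that it would suffice to postulate only [3': for all $\varphi, \psi \in \mathcal{L}$:
$\mathfrak{I}^{c,o}(\varphi \wedge \psi) \subseteq \mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\psi)$] in the light of
$\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \wedge \psi \rightarrow \varphi$, $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \wedge \psi \rightarrow \psi$,
and thus by 2, $\mathfrak{I}^{c,o}(\varphi \wedge \psi) \supseteq \mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\psi)$.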

The postulates are motivated by the following desiderata and considerations:

1. $\mathfrak{N}$ should believe that [$\top$ is true] in every possible parameter-setting, and
thus, by def.140, $\mathfrak{I}^{c,o}(\top)$ has to be a pattern which is active in every possible
parameter-setting; we choose $\{bias\}$ to be this pattern. The postulate
for $\bot$ is a kind of “normalization” constraint - if one wanted to get rid of
it, one might simply replace ‘$N$’ in the considerations below by ‘$\mathfrak{I}^{c,o}(\bot)$’.

2. $\mathcal{TH}_{\mathfrak{I}^{c,o}}$ is the set of formulas $\varphi$, s.t. $\varphi$ is believed by the net in every
possible parameter-setting (since $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \subseteq s$ for arbitrary $s$).
If $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \rightarrow \psi$, i.e., if $\varphi \rightarrow \psi$ is entailed deductively by $\mathcal{TH}_{\mathfrak{I}^{c,o}}$
in the standard sense, the net should also believe that [$\varphi \rightarrow \psi$ is true]
in every parameter-setting. Now suppose the net is in the parameter-setting
$\mathfrak{I}^{c,o}(\varphi)$, i.e., all and only the nodes within $\mathfrak{I}^{c,o}(\varphi)$ fire; in this
case the net also believes that [$\varphi$ is true], by def.140 again. But then, by
modus ponens, the net agent should also believe that [$\psi$ is true] in this
case, which entails, according to the way in which we have associated net
states with belief states, that $\mathfrak{I}^{c,o}(\varphi)$ must be a superset of $\mathfrak{I}^{c,o}(\psi)$.

We forgo postulating the converse direction as well, i.e. [if
$\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$ then also $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \rightarrow \psi$], since this will have some
technical advantages concerning the proof of the representation theorem
in the next section. If $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \rightarrow \psi$ we say that $\varphi$ implies $\psi$ according
to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$, or that $\varphi$ is stronger than $\psi$ ($\psi$ is weaker than $\varphi$) according
to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$. Analogously, if $\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$ we say that $\varphi$ implies $\psi$
according to $\mathfrak{I}^{c,o}$, or that $\varphi$ is stronger than $\psi$ ($\psi$ is weaker than $\varphi$)
according to $\mathfrak{I}^{c,o}$. ‘Stronger’ is not meant to entail being strictly stronger,
and the like for ‘weaker’.

3. at first glance it may seem strange that the interpretation of a conjunction
should be identical to the union of the component interpretations -
generally, we are used to defining it by the intersection of the component
values. On the other hand, this postulate intuitively matches the
interpretation of neurons as “elementary-feature detectors”: suppose there are
just two neurons $n_1$ and $n_2$; $n_1$ fires iff a red object has been detected,
whereas $n_2$ fires iff a large object has been detected. If now an object that
is both red and large has been detected, this will be the case if and only if
both $n_1$ and $n_2$ fire, i.e. the set of firing neurons will be identical to the
union of $\{bias, n_1\}$ and $\{bias, n_2\}$ and not to their intersection.

4. $bias$ fires in every state anyway.
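
The feature-detector reading of postulate 3 can be made concrete with a minimal sketch. The dictionary `pattern` and the helper `conjunction_pattern` are illustrative names introduced here, not part of the formal apparatus; the two patterns are the ones from the red/large illustration.

```python
BIAS = "bias"

# Patterns for two elementary features, as in the red/large illustration:
# n1 fires iff something red is detected, n2 iff something large is detected.
pattern = {
    "red": frozenset({BIAS, "n1"}),
    "large": frozenset({BIAS, "n2"}),
}

def conjunction_pattern(*features):
    """Postulate 3: the pattern of a conjunction is the union of the component patterns."""
    result = frozenset({BIAS})  # postulate 4: bias belongs to every pattern
    for feature in features:
        result |= pattern[feature]
    return result

red_and_large = conjunction_pattern("red", "large")
# The intersection would lose both feature detectors and keep only the bias node.
intersection = pattern["red"] & pattern["large"]
```

Under these assumptions the union contains both detectors, whereas the intersection collapses to the pattern of the verum.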

Compare 1 and 2 to the following quotation of Rumelhart et al.[138], 
p.84, on distributed representations: “. . . the relation between a type and an 
instance can be implemented by the relationship between a set of units and a 
larger set that includes it. Notice that the more general the type, the smaller 
the set of units used to encode it. As the number of terms in an intensional 
description gets smaller, the corresponding extensional set gets larger.” Fur- 
thermore, compare 3 to the following quotation taken again from Rumelhart 
et al.[138], p.94: “A distributed representation uses a unit for a set of items, 
and it implicitly encodes a particular item as the intersection of the sets that 
correspond to the active units”; on p.95, such distributed representations are 
explicitly referred to as “conjunctive” . 

Although postulate 3 is supported by what we have just pointed out
above, the restriction to mappings $\mathfrak{I}^{c,o}$ which satisfy this postulate is not as
“innocent” as it may seem: assume that in the example above there is an
additional neuron $n_3$ which fires iff a red and large object has been observed.
In that case, $n_3$ neither necessarily fires when a red object has been detected,
nor when a large object has been detected. Thus $n_3$ should be a member of
$\mathfrak{I}^{c,o}(red \wedge large)$, but $n_3$ should not be a member of $\mathfrak{I}^{c,o}(red) \cup \mathfrak{I}^{c,o}(large)$,
or so it seems. But such a situation is excluded by postulate 3. So we had
better deal in more detail with the consequences of our postulates regarding
the interpretation of the patterns $\{bias, n\}$, which are the borderline cases
of distributed representation (such patterns are sometimes even called ‘local’
although they should not be mixed up with the strictly local representation by
means of nodes $n$: see van Gelder[174], p.236).

A partial hint is given by postulate 2: it says that the stronger a formula
is (according to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$ or $\mathfrak{I}^{c,o}$), the larger its interpretation; or, put inversely:
the weaker a formula is, the smaller its interpretation. E.g. the falsum $\bot$ implies
every formula $\alpha$ since $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \bot \rightarrow \alpha$ for all $\alpha \in \mathcal{L}$. Correspondingly, its
interpretation is the largest possible image under $\mathfrak{I}^{c,o}$ (which is by normalization
identical to $N$). On the other hand, the verum $\top$ is implied by every formula
$\alpha$, since $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \alpha \rightarrow \top$ for all $\alpha \in \mathcal{L}$, and correspondingly its interpretation
is the smallest possible one ($= \{bias\}$). Now suppose $\mathfrak{I}^{c,o}(\varphi) = \{bias, n\}$: in
such a case $\varphi$ is strictly stronger than $\top$ according to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$ and thus also
according to $\mathfrak{I}^{c,o}$, since $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \rightarrow \top$, but $\mathcal{TH}_{\mathfrak{I}^{c,o}} \nvdash \top \rightarrow \varphi$ because
otherwise also $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi$ and $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \neq \{bias, n\}$; moreover, there is no
$\psi$ s.t. $\psi$ is both strictly stronger than $\top$ and strictly weaker than $\varphi$ according
to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$, since in that case we would have
$\{bias\} = \mathfrak{I}^{c,o}(\top) \subsetneq \mathfrak{I}^{c,o}(\psi) \subsetneq \mathfrak{I}^{c,o}(\varphi) = \{bias, n\}$,
which is impossible. Therefore the pattern $\{bias, n\}$ corresponds
to a “minimal” state of belief having a content of minimal strength
(a “microfeature”) except for the beliefs the agent has in every possible state.
All states of belief which have stronger contents than such minimal ones are
superpositions of minimal belief patterns. E.g. if $\mathfrak{I}^{c,o}(\varphi) = \{bias, n_1\}$, and
$\mathfrak{I}^{c,o}(\psi) = \{bias, n_2\}$, then $\mathfrak{I}^{c,o}(\varphi \wedge \psi) = \{bias, n_1, n_2\}$ by postulate 3, and
there is no pattern $\{bias, n\}$ that has $\varphi \wedge \psi$ as its content (or rather what is
expressed by $\varphi \wedge \psi$). This is the kind of assumption that is contained implicitly
in our postulates 1-4.

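The postulates concerning $\mathfrak{I}^{c,o}$ imply (since $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^{p}(\varphi))$,
for $g$ bijective) according to def.139 the following corresponding postulates on
$\mathfrak{I}^{p}$:

Postulates for $\mathfrak{I}^{p}$:

1. $\mathfrak{I}^{p}(\top) = \emptyset$,
$\mathfrak{I}^{p}(\bot) = N^{p}$.

2. Let $\mathcal{TH}_{\mathfrak{I}^{p}} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{p}(\varphi) = \emptyset\}$:

for all $\varphi, \psi \in \mathcal{L}$: if $\mathcal{TH}_{\mathfrak{I}^{p}} \vdash \varphi \rightarrow \psi$ then $\mathfrak{I}^{p}(\varphi) \supseteq \mathfrak{I}^{p}(\psi)$.

3. For all $\varphi, \psi \in \mathcal{L}$: $\mathfrak{I}^{p}(\varphi \wedge \psi) = \mathfrak{I}^{p}(\varphi) \cup \mathfrak{I}^{p}(\psi)$.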



If we combine the notion of an interpreted inhibition network agent 
with the postulates from above, we get the notion of an interpreted cumulative- 
ordered inhibition network agent which is the cornerstone of all further inves- 
tigations in this part IV: 

Definition 153 (Cumulative-Ordered Interpreted Inhibition Net Agents) 

A cumulative-ordered (interpreted inhibition) network agent $\mathfrak{N}$ is a
quintuple $\langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}\rangle$, s.t.

1. $\mathfrak{N}$ is an interpreted (inhibition network) agent (implying that $\mathcal{I}$ is an
FHIN)

2. $\mathfrak{I}^{c,o}$ and $\mathfrak{I}^{p}$ satisfy the postulates from above.*

A cumulative-ordered partially interpreted (inhibition) network is defined
analogously to def.153, but with $\mathfrak{N}$ being a partially interpreted (inhibition
network) agent. It is easy to see that all of the results stated below would
also turn out to be true if we decided to use cumulative-ordered partially
interpreted networks instead of totally interpreted ones. Furthermore, each of
the definitions below may also be stated for cumulative-ordered partially
interpreted networks. In this chapter we opt for totally interpreted networks
just for the sake of simplicity. It is only in chapters 18 and 19, and in
chapter 24 of the appendix, that we will refer to partially interpreted network
agents.

Def.153 directly entails: 

Corollary 154 

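• $\mathfrak{I}^{c,o}(\varphi \vee \psi) \subseteq \mathfrak{I}^{c,o}(\varphi) \cap \mathfrak{I}^{c,o}(\psi)$

(since $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \rightarrow \varphi \vee \psi$, $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \psi \rightarrow \varphi \vee \psi$, and postulate 2)

• $\mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\neg\varphi) = \mathfrak{I}^{c,o}(\varphi \wedge \neg\varphi) = \mathfrak{I}^{c,o}(\bot) = N$ (postulates 3, 2, and 1)

• $(N \setminus \mathfrak{I}^{c,o}(\varphi)) \cup \{bias\} \subseteq \mathfrak{I}^{c,o}(\neg\varphi)$ (because of the previous line)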

*The qualification ‘cumulative-ordered’ in ‘cumulative-ordered (interpreted inhibition)
network agent’ will be justified later, when we state a soundness and completeness theorem
for the system CL with respect to the “psycho”-semantics of cumulative-ordered net agents.
Recall from part III that CL is also sound and complete with respect to the
cumulative-ordered model semantics for normic conditionals.

• $\mathcal{TH}_{\mathfrak{I}^{c,o}} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{c,o}(\varphi) = \{bias\}\}$: (by postulates 2 and 1, and by
$\{bias\} \neq N$ entailed by def.139) $\mathcal{TH}_{\mathfrak{I}^{c,o}}$ is a consistent deductively closed
set of factual formulas, i.e., a consistent theory in $\mathcal{L}$; in particular:

if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi$ then $\mathfrak{I}^{c,o}(\varphi) = \{bias\}$

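• if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \leftrightarrow \psi$ then $\mathfrak{I}^{c,o}(\varphi) = \mathfrak{I}^{c,o}(\psi)$ (because of postulate 2). Thus,
e.g.

$\mathfrak{I}^{c,o}(\neg(\varphi \vee \psi)) = \mathfrak{I}^{c,o}(\neg\varphi \wedge \neg\psi)$

• $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi[a] \rightarrow \psi[a]$ iff $\mathfrak{N} \models \varphi[x] \rightarrow \psi[x]$ (by postulate 2 and def.150).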
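
The set-theoretic claims of the corollary can be checked mechanically on a small instance. The sketch below is an illustration under stated assumptions, not a proof: formulas of a two-variable propositional language are represented by the sets of worlds at which they are true, the labelling `LABEL` is hypothetical, and the interpretation `interp` is built from it by putting a node into the pattern of a formula exactly when some world in the node's label falsifies the formula.

```python
from itertools import combinations

BIAS = "bias"
# Worlds are truth-value pairs for two propositional variables.
WORLDS = [(True, True), (True, False), (False, True), (False, False)]
TOP = frozenset(WORLDS)
# Every formula, up to logical equivalence, is one of the 16 sets of worlds.
PROPS = [frozenset(c) for k in range(len(WORLDS) + 1) for c in combinations(WORLDS, k)]

# Hypothetical labelling: each non-bias node carries a non-empty set of worlds.
LABEL = {
    "n1": frozenset({WORLDS[0], WORLDS[1]}),
    "n2": frozenset({WORLDS[0], WORLDS[2]}),
    "n3": frozenset({WORLDS[3]}),
}
NODES = frozenset(LABEL) | {BIAS}

def interp(phi):
    """Pattern of phi: bias plus every node whose label is not contained in phi."""
    return frozenset({BIAS}) | {n for n, ws in LABEL.items() if not ws <= phi}

def corollary_holds():
    """Check the first three lines of the corollary, plus postulate 3, for all formulas."""
    for phi in PROPS:
        neg_phi = TOP - phi
        if interp(phi) | interp(neg_phi) != NODES:
            return False
        if not ((NODES - interp(phi)) | {BIAS}) <= interp(neg_phi):
            return False
        for psi in PROPS:
            if not interp(phi | psi) <= (interp(phi) & interp(psi)):
                return False
            if interp(phi & psi) != interp(phi) | interp(psi):
                return False
    return True
```

Because equivalent formulas are identified with the same set of worlds, the fifth line of the corollary holds in this representation by construction.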

From a neurophysiological point of view it may again seem strange
to assume that the activity pattern associated with believing $\neg\varphi$ has to be
a superset of the complement of the pattern associated with believing $\varphi$, as
it is stated by the third line of the last corollary. However, this fact is just
another consequence of the assumption that beliefs are the superpositions of
“minimal” beliefs. Of course we do not claim that this assumption holds for
every single part of a biological or artificial brain, but it might nevertheless
be an economical way of representation which could be employed in certain
substructures:

Example 155 

Let $N = \{bias, n_1, n_2, n_3, n_4\}$, let $\mathcal{L}$ be regarded as a propositional
language again, and let us just consider the propositional variables $red$ and $large$;
let us assume that

$\mathfrak{I}^{c,o}((red \wedge large) \vee (red \wedge \neg large) \vee (\neg red \wedge large)) = \{bias, n_1\}$,

$\mathfrak{I}^{c,o}((red \wedge large) \vee (red \wedge \neg large) \vee (\neg red \wedge \neg large)) = \{bias, n_2\}$,

$\mathfrak{I}^{c,o}((red \wedge large) \vee (\neg red \wedge large) \vee (\neg red \wedge \neg large)) = \{bias, n_3\}$,

$\mathfrak{I}^{c,o}((red \wedge \neg large) \vee (\neg red \wedge large) \vee (\neg red \wedge \neg large)) = \{bias, n_4\}$.

So e.g. $n_1$ fires iff the agent believes that there is some object right
in front of him which is either red and large, or red and not large, or not red
and large. The only possibility excluded is the combination of being not red
and not large. Analogously for the nodes $n_2, n_3, n_4$. The patterns $\{bias, n_i\}$ for
$i = 1, 2, 3, 4$ are the minimal patterns having minimal content except for the
pattern only containing the bias node, which corresponds to the belief that there is
something there without any further qualification (thus this belief is always true).
Intuitively, if a red object is detected, the nodes $bias, n_1, n_2$ should fire
simultaneously, while if an object is detected which is not red, the nodes $bias, n_3, n_4$
should fire. If $\mathfrak{I}^{c,o}$ is an interpretation mapping of a cumulative-ordered network
agent, we can indeed derive that $\mathfrak{I}^{c,o}(red) = \{bias, n_1, n_2\}$, $\mathfrak{I}^{c,o}(large) =
\{bias, n_1, n_3\}$, $\mathfrak{I}^{c,o}(\neg red) = \{bias, n_3, n_4\}$, $\mathfrak{I}^{c,o}(\neg large) = \{bias, n_2, n_4\}$. The
latter patterns are superpositions of the minimal patterns given above. In this
case the pattern associated with believing $\neg red$ is identical to the complement of
the pattern associated with believing $red$, if we disregard the bias node, and
analogously for $\neg large$ and $large$.
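
The derived patterns of this example can be recomputed with a short sketch. It relies on an assumption that is not stated in the example itself: each node $n_i$ is tagged with the single world excluded by the disjunction whose pattern is $\{bias, n_i\}$, and a node belongs to the pattern of a formula exactly when its tagged world falsifies the formula. The names `LABEL`, `prop` and `interp` are illustrative.

```python
from itertools import product

BIAS = "bias"
# A world is a pair (red, large) of truth values.
WORLDS = frozenset(product([True, False], repeat=2))

# Assumed labelling: n_i is tagged with the world that the i-th disjunction excludes.
LABEL = {
    "n1": frozenset({(False, False)}),  # not red, not large
    "n2": frozenset({(False, True)}),   # not red, large
    "n3": frozenset({(True, False)}),   # red, not large
    "n4": frozenset({(True, True)}),    # red, large
}

def prop(test):
    """The set of worlds at which the condition `test(red, large)` is true."""
    return frozenset(w for w in WORLDS if test(*w))

def interp(p):
    """Pattern of p: bias plus every node whose tagged worlds are not all inside p."""
    return frozenset({BIAS}) | {n for n, ws in LABEL.items() if not ws <= p}

red = prop(lambda r, lg: r)
large = prop(lambda r, lg: lg)
not_red = WORLDS - red
not_large = WORLDS - large
first_disjunction = WORLDS - {(False, False)}
```

With this labelling the four minimal patterns and the four derived patterns come out as stated in the example, and the pattern for `not_red` is the complement of the pattern for `red` apart from the bias node.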

Since the interpretation mappings $\mathfrak{I}^{p}$ and $\mathfrak{I}^{c,o}$ generally lack inductive
clauses for negation and disjunction, not every such mapping may be defined
by fixing the interpretation of the propositional variables and extending the
interpretation recursively to the complex factual formulas. Thus one may ask
how difficult it is in general to construct such interpretation mappings. The
answer is that such a construction may be achieved easily by means of assigning
sets of worlds to nodes. By a world we mean, just as in part III, a truth value
assignment for the propositional variables in $\mathcal{L}$ (where $\mathcal{L}$ is again considered as
a propositional language, just as in part III).

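The next definition shows the method of construction, which we outline
just for $\mathfrak{I}^{c,o}$ (but which is similar in the case of $\mathfrak{I}^{p}$):

Definition 156 (Interpretations by Labellings)

Let $W$ be the set of worlds for $\mathcal{L}$, $\mathfrak{I}^{c,o} : \mathcal{L} \rightarrow \wp(N)$ an interpretation
for a cumulative-ordered inhibition network agent.

Furthermore, let $l : N \setminus \{bias\} \rightarrow \wp(W) \setminus \{\emptyset\}$ (a labelling function):

1. let $\mathfrak{I}^{c,o}[l] : \mathcal{L} \rightarrow \wp(N)$ be defined by:

$\mathfrak{I}^{c,o}[l](\varphi) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models \varphi\}$ (where we use the
satisfaction relation $\models$ defined in section 9.3 for states in cumulative
models relative to a labelling $l$; thus the satisfaction relation is actually
dependent on the choice of $l$, i.e., $\models \,=\, \models_{l}$)

2. let $l[\mathfrak{I}^{c,o}] : N \setminus \{bias\} \rightarrow \wp(W) \setminus \{\emptyset\}$ be defined by

$l[\mathfrak{I}^{c,o}](n) = \{w \in W \mid \forall\varphi \in \mathcal{L}: n \notin \mathfrak{I}^{c,o}(\varphi) \rightarrow w \models \varphi\}$.

$l[\mathfrak{I}^{c,o}](n)$ cannot be empty since $\{\varphi \in \mathcal{L} \mid n \notin \mathfrak{I}^{c,o}(\varphi)\}$ is a consistent
theory, for $n \notin \mathfrak{I}^{c,o}(\top)$, $n \in \mathfrak{I}^{c,o}(\bot)$, and if $n \notin \mathfrak{I}^{c,o}(\varphi)$, $n \notin \mathfrak{I}^{c,o}(\varphi \rightarrow \psi)$, then

$n \notin \mathfrak{I}^{c,o}(\varphi \wedge (\varphi \rightarrow \psi)) = \mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\varphi \rightarrow \psi)$,

and thus $n \notin \mathfrak{I}^{c,o}(\psi)$ (we have used corollary 154 here). $l[\mathfrak{I}^{c,o}](bias)$ is identical
to $W$.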

Note that both $\mathfrak{I}^{c,o}[l]$ and $l[\mathfrak{I}^{c,o}]$ are defined by negated clauses. This is
due to the definition of interpretation mappings by which the classical
connectives are interpreted dually to their standard interpretation in possible worlds
semantics (e.g. conjunctions are given by unions of interpretations, the verum
is given by the least image under $\mathfrak{I}^{c,o}$, etc.).

Theorem 157 

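Let $\mathfrak{I}^{c,o} : \mathcal{L} \rightarrow \wp(N)$ be an interpretation for a cumulative-ordered
network agent, $l : N \setminus \{bias\} \rightarrow \wp(W) \setminus \{\emptyset\}$ a labelling:

1. $\mathfrak{I}^{c,o}[l]$ is an interpretation.

2. $l[\mathfrak{I}^{c,o}]$ is a labelling.

3. $\mathfrak{I}^{c,o}[l[\mathfrak{I}^{c,o}]] = \mathfrak{I}^{c,o}$.

4. $l[\mathfrak{I}^{c,o}[l]] = l$.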

Proof: 

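In the following, let $\varphi, \psi \in \mathcal{L}$, arbitrary:

1. First we show that $\mathfrak{I}^{c,o}[l]$ is an interpretation by proving the postulates
from above to be satisfied:

(a) $\mathfrak{I}^{c,o}[l](\top) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models_{l} \top\} = \{bias\}$, since
$\top$ is satisfied by $n$ relative to every possible labelling.

$\mathfrak{I}^{c,o}[l](\bot) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models_{l} \bot\} = N$, since $\bot$
is not satisfied by $n$ relative to any labelling.

(b) Let $\mathcal{TH}_{\mathfrak{I}^{c,o}[l]} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{c,o}[l](\varphi) = \{bias\}\}$, and assume that
$\mathcal{TH}_{\mathfrak{I}^{c,o}[l]} \vdash \varphi \rightarrow \psi$:

by corollary 154, $\mathfrak{I}^{c,o}[l](\varphi \rightarrow \psi) = \{bias\}$,

but by def.156, $\mathfrak{I}^{c,o}[l](\varphi \rightarrow \psi) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models_{l} (\varphi \rightarrow \psi)\}$,

and thus for all $n \in N \setminus \{bias\}$ it follows that $n \models_{l} (\varphi \rightarrow \psi)$.

If $n \in \mathfrak{I}^{c,o}[l](\psi)$, then by def.156 again, either (i) $n = bias$ and
therefore $n \in \mathfrak{I}^{c,o}[l](\varphi)$, or (ii) not $n \models_{l} \psi$: in this latter case,
if $n \models_{l} \varphi$, then by modus ponens $n \models_{l} \psi$ and we would have a
contradiction; thus not $n \models_{l} \varphi$, and by def.156 $n \in \mathfrak{I}^{c,o}[l](\varphi)$.

So in any case we have: $\mathfrak{I}^{c,o}[l](\varphi) \supseteq \mathfrak{I}^{c,o}[l](\psi)$.

(c) $bias \in \mathfrak{I}^{c,o}[l](\varphi)$ directly by def.156.

(d) $\mathfrak{I}^{c,o}[l](\varphi \wedge \psi) = \mathfrak{I}^{c,o}[l](\varphi) \cup \mathfrak{I}^{c,o}[l](\psi)$:

we can concentrate on $n \neq bias$;
$n \in \mathfrak{I}^{c,o}[l](\varphi \wedge \psi)$ iff, by def.156,
not $n \models_{l} \varphi \wedge \psi$ iff, by the def. of $\models_{l}$,
not $n \models_{l} \varphi$ or not $n \models_{l} \psi$ iff, by def.156,
$n \in \mathfrak{I}^{c,o}[l](\varphi) \cup \mathfrak{I}^{c,o}[l](\psi)$.

2. $l[\mathfrak{I}^{c,o}]$ is a labelling, since $\{w \mid \forall\varphi \in \mathcal{L}: n \notin \mathfrak{I}^{c,o}(\varphi) \rightarrow w \models \varphi\}$ is
non-empty as pointed out above.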

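3. $\mathfrak{I}^{c,o}[l[\mathfrak{I}^{c,o}]] = \mathfrak{I}^{c,o}$:

we may focus again just on $n \neq bias$;

(a) ‘$\subseteq$’: assume that $n \in \mathfrak{I}^{c,o}[l[\mathfrak{I}^{c,o}]](\varphi)$, and therefore not $n \models_{l[\mathfrak{I}^{c,o}]} \varphi$
by def.156. Thus by the def. of $\models_{l[\mathfrak{I}^{c,o}]}$, there is a world $w \in W$,
s.t. $w \not\models \varphi$, but for all $\psi$ with $n \notin \mathfrak{I}^{c,o}(\psi)$ it holds that $w \models \psi$. If
$n \notin \mathfrak{I}^{c,o}(\varphi)$, then we would therefore have that $w \models \varphi$, which would be a
contradiction. Therefore, $n \in \mathfrak{I}^{c,o}(\varphi)$.

(b) ‘$\supseteq$’: assume that $n \in \mathfrak{I}^{c,o}(\varphi)$. Since $\{\psi \in \mathcal{L} \mid n \notin \mathfrak{I}^{c,o}(\psi)\}$ is a
consistent theory, as we have already shown above, and since $\varphi$ is not
a member of this theory, there is - by the completeness theorem for
propositional logic - a world $w \in W$, s.t. $w \not\models \varphi$, but where for all
$\psi$ with $n \notin \mathfrak{I}^{c,o}(\psi)$ it holds that $w \models \psi$. $w$ is a member of $l[\mathfrak{I}^{c,o}](n)$
by def.156. Thus, not $n \models_{l[\mathfrak{I}^{c,o}]} \varphi$, and therefore $n \in \mathfrak{I}^{c,o}[l[\mathfrak{I}^{c,o}]](\varphi)$
again by def.156.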

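4. $l[\mathfrak{I}^{c,o}[l]] = l$:

again we may concentrate on $n \neq bias$;

(a) ‘$\subseteq$’: assume that $w \in l[\mathfrak{I}^{c,o}[l]](n)$. By def.156, $w \models \varphi$ for all $\varphi$ with
$n \notin \mathfrak{I}^{c,o}[l](\varphi)$. Thus, also $w \models \varphi$ for all $\varphi$ with $n \models_{l} \varphi$. It follows that
$w \in l(n)$, because if $w \notin l(n)$ there would be a formula $\psi$, s.t. $w \not\models \psi$
but $n \models_{l} \psi$, since $\mathcal{L}$ has only finitely many propositional variables
(take the disjunction of the world-descriptions of the worlds in $l(n)$).

(b) ‘$\supseteq$’: assume that $w \in l(n)$. For arbitrary $\varphi \in \mathcal{L}$, if $n \notin \mathfrak{I}^{c,o}[l](\varphi)$,
then $n \models_{l} \varphi$ by def.156, and thus also $w \models \varphi$. But this implies that
$w \in l[\mathfrak{I}^{c,o}[l]](n)$ by def.156 again. ■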

As we have seen, labelling functions may be used to define interpreta- 
tion mappings in a neat way. But our net semantics as such does not presuppose 
a possible worlds semantics in any way. 

Example 158 

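Let $N = \{bias, n_1, n_2\}$; let us restrict $\mathcal{L}$ for a moment to a propositional
language based on just two propositional variables $p$ and $q$; let $W =
\{w_1, w_2, w_3, w_4\}$ be the set of possible worlds for $\mathcal{L}$, s.t. $w_1 \models p \wedge q$, $w_2 \models p \wedge \neg q$,
$w_3 \models \neg p \wedge q$, $w_4 \models \neg p \wedge \neg q$. Let $l : N \setminus \{bias\} \rightarrow \wp(W) \setminus \{\emptyset\}$ be a labelling of
$N$, s.t. $l(n_1) = \{w_1, w_2\}$, and $l(n_2) = \{w_1, w_3\}$. Thus, e.g., $n_1 \models p$, but neither
$n_1 \models q$, nor $n_1 \models \neg q$; $n_2 \models q$, but neither $n_2 \models p$, nor $n_2 \models \neg p$.

It follows that, e.g.,

$\mathfrak{I}^{c,o}[l](p) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models p\} = \{bias, n_2\}$,

$\mathfrak{I}^{c,o}[l](q) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models q\} = \{bias, n_1\}$,

$\mathfrak{I}^{c,o}[l](\neg p) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models \neg p\} = \{bias, n_1, n_2\}$,

$\mathfrak{I}^{c,o}[l](\neg q) = \{bias\} \cup \{n \in N \setminus \{bias\} \mid \text{not: } n \models \neg q\} = \{bias, n_1, n_2\}$.

It is easy to see that

$l[\mathfrak{I}^{c,o}[l]](n_1) = \{w \in W \mid \forall\varphi \in \mathcal{L}: n_1 \notin \mathfrak{I}^{c,o}[l](\varphi) \rightarrow w \models \varphi\} = \{w_1, w_2\} = l(n_1)$,

and that also

$l[\mathfrak{I}^{c,o}[l]](n_2) = \{w \in W \mid \forall\varphi \in \mathcal{L}: n_2 \notin \mathfrak{I}^{c,o}[l](\varphi) \rightarrow w \models \varphi\} = \{w_1, w_3\} = l(n_2)$.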
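
The two constructions of Definition 156 and the round trip of Theorem 157 (item 4) can be replayed on this example. The sketch below makes two assumptions that should be kept in mind: formulas are identified with the sets of worlds at which they are true, and a node satisfies a formula relative to a labelling iff every world in its label satisfies the formula (the reading suggested by the example, since the definition from section 9.3 is not repeated here). The function names are illustrative.

```python
from itertools import combinations

BIAS = "bias"
# Worlds w1..w4 as truth-value pairs (p, q), in the order used in the example.
WORLDS = [(True, True), (True, False), (False, True), (False, False)]
w1, w2, w3, w4 = WORLDS

# All 16 propositions over p and q, each represented as a set of worlds.
PROPS = [frozenset(c) for k in range(len(WORLDS) + 1) for c in combinations(WORLDS, k)]

label = {"n1": frozenset({w1, w2}), "n2": frozenset({w1, w3})}

def interp_from_label(lab):
    """Map each proposition phi to {bias} plus the nodes n with label not contained in phi."""
    return {
        phi: frozenset({BIAS}) | {n for n, ws in lab.items() if not ws <= phi}
        for phi in PROPS
    }

def label_from_interp(interp, nodes):
    """Map each node n to the worlds satisfying every phi whose pattern does not contain n."""
    return {
        n: frozenset(
            w for w in WORLDS
            if all(w in phi for phi in PROPS if n not in interp[phi])
        )
        for n in nodes
    }

interp = interp_from_label(label)
recovered = label_from_interp(interp, label.keys())

p = frozenset({w1, w2})
q = frozenset({w1, w3})
not_p = frozenset(WORLDS) - p
not_q = frozenset(WORLDS) - q
```

Recovering the original labelling from the interpretation it induces is exactly what item 4 of the theorem asserts for this instance.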

Now let us turn to the properties of the belief states of cumulative- 
ordered net agents. The following properties are again easy to see in light of 
the remarks above: 

Lemma 159 

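Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}\rangle$ be a cumulative-ordered interpreted
inhibition network agent, let $s$ be a parameter-setting in the parameter-setting
space $S$ of $Sys$:

it follows for all $\varphi, \psi \in \mathcal{L}$ that

• ($s \models_{\mathfrak{N}} B^{c,o}(\varphi)$ and $s \models_{\mathfrak{N}} B^{c,o}(\psi)$) iff $s \models_{\mathfrak{N}} B^{c,o}(\varphi \wedge \psi)$

• if $s \models_{\mathfrak{N}} B^{c,o}(\varphi)$ and $s \models_{\mathfrak{N}} B^{c,o}(\varphi \rightarrow \psi)$, then $s \models_{\mathfrak{N}} B^{c,o}(\psi)$

• if $s \models_{\mathfrak{N}} B^{c,o}(\varphi)$ or $s \models_{\mathfrak{N}} B^{c,o}(\psi)$, then $s \models_{\mathfrak{N}} B^{c,o}(\varphi \vee \psi)$

• given $s \not\models_{\mathfrak{N}} B^{c,o}(\bot)$: if $s \models_{\mathfrak{N}} B^{c,o}(\neg\varphi)$ then $s \not\models_{\mathfrak{N}} B^{c,o}(\varphi)$

(and accordingly, if $B^{c,o}$ is replaced by $B^{p}$).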

These are conditions which we would usually expect a rational belief 
predicate or operator to satisfy. 
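
These closure conditions can be tested exhaustively on a small model. The sketch assumes, in line with the pattern-containment reading used above, that an occurrent belief in a formula is ascribed in a central state iff the pattern of the formula is contained in that state; the labelling, the representation of formulas as sets of worlds, and all function names are assumptions of this illustration.

```python
from itertools import combinations

BIAS = "bias"
WORLDS = [(True, True), (True, False), (False, True), (False, False)]
TOP = frozenset(WORLDS)
BOTTOM = frozenset()
PROPS = [frozenset(c) for k in range(len(WORLDS) + 1) for c in combinations(WORLDS, k)]

# Hypothetical labelling of the two non-bias nodes.
LABEL = {"n1": frozenset({WORLDS[0], WORLDS[1]}), "n2": frozenset({WORLDS[0], WORLDS[2]})}

def interp(phi):
    """Pattern of phi: bias plus every node whose label is not contained in phi."""
    return frozenset({BIAS}) | {n for n, ws in LABEL.items() if not ws <= phi}

def believes(state, phi):
    """Assumed ascription rule: phi is believed iff its pattern is active in the state."""
    return interp(phi) <= state

def central_states():
    """All activity patterns that contain the bias node."""
    others = sorted(LABEL)
    for k in range(len(others) + 1):
        for chosen in combinations(others, k):
            yield frozenset(chosen) | {BIAS}

def implies(phi, psi):
    """The material conditional as a set of worlds."""
    return (TOP - phi) | psi

def closure_conditions_hold():
    for s in central_states():
        for phi in PROPS:
            for psi in PROPS:
                if (believes(s, phi) and believes(s, psi)) != believes(s, phi & psi):
                    return False
                if believes(s, phi) and believes(s, implies(phi, psi)) and not believes(s, psi):
                    return False
                if (believes(s, phi) or believes(s, psi)) and not believes(s, phi | psi):
                    return False
            if not believes(s, BOTTOM) and believes(s, TOP - phi) and believes(s, phi):
                return False
    return True
```

The only state in which the falsum is believed is the one in which every node fires, which is why the consistency condition is stated relative to states that do not believe the falsum.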

Apart from belief simpliciter we have also introduced the concept of 
a total belief for net agents. In the case of cumulative-ordered agents we have 
(as follows again easily): 

Lemma 160 

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}\rangle$ be a cumulative-ordered interpreted
inhibition network agent:

it follows for all $\varphi, \psi \in \mathcal{L}$ that

• if $s \models_{\mathfrak{N}} AB^{c,o}(\varphi)$ and $\psi$ is strictly stronger than $\varphi$ (according to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$
or $\mathfrak{I}^{c,o}$), then $s \not\models_{\mathfrak{N}} B^{c,o}(\psi)$

• if $s \models_{\mathfrak{N}} AB^{c,o}(\varphi)$ and $s \models_{\mathfrak{N}} B^{c,o}(\psi)$, then $\psi$ is weaker than $\varphi$ according
to $\mathfrak{I}^{c,o}$ (but not necessarily according to $\mathcal{TH}_{\mathfrak{I}^{c,o}}$)

(and accordingly if $B^{c,o}$ is replaced by $B^{p}$).

Using def.144 and 150 we can associate theories of conditionals with
cumulative-ordered interpreted inhibition network agents:

Definition 161 (Conditional Theories Corresponding to Cumulative- Ordered 
Net Agents) 

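For every cumulative-ordered net agent $\mathfrak{N}$:

1. let $\mathcal{TH}_{\Rightarrow}(\mathfrak{N}) = \{\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow} \mid \mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]\}$;

$\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is the (defeasible) conditional theory corresponding to $\mathfrak{N}$

2. let $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) = \{\alpha[x] \rightarrow \beta[x] \in \mathcal{L}_{\rightarrow} \mid \mathfrak{N} \models \alpha[x] \rightarrow \beta[x]\}$;

$\mathcal{TH}_{\rightarrow}(\mathfrak{N})$ is the universal conditional theory corresponding to $\mathfrak{N}$.

By def.144, $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is also the total description of the nonmonotonic
inference dispositions of $\mathfrak{N}$. Calling $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ a conditional theory will be
justified by lemma 165 in the next section.

Keep in mind that $\alpha[x] \rightarrow \beta[x] \in \mathcal{TH}_{\rightarrow}(\mathfrak{N})$ iff $\mathfrak{N} \models \alpha[x] \rightarrow \beta[x]$ iff
$\mathfrak{I}^{c,o}(\alpha[a] \rightarrow \beta[a]) = \{bias\}$ iff $\alpha[a] \rightarrow \beta[a] \in \mathcal{TH}_{\mathfrak{I}^{c,o}} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{c,o}(\varphi) =
\{bias\}\}$ iff $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \alpha[a] \rightarrow \beta[a]$ (recall corollary 154); we are going to return
to this point in the next section. Because of the strong association of $\mathcal{TH}_{\rightarrow}(\mathfrak{N})$
with $\mathcal{TH}_{\mathfrak{I}^{c,o}}$ we will often just say ‘theory’ in the following instead of the
longer ‘universal conditional theory’ (see section 9.1). Note that $\mathcal{TH}_{\rightarrow}(\mathfrak{N})$ is
deductively closed with respect to universal formulas.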

The following definitions are in perfect analogy to the introduction of 
the semantical notions in part III: 

Definition 162 (Cumulative- Ordered Network Semantics) 

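Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}\rangle$ be a cumulative-ordered interpreted
inhibition network agent:

1. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$ is called cumulative-ordered-net-valid iff

for every cumulative-ordered interpreted net agent $\mathfrak{N}$: $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

2. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ ($KB_{\Rightarrow}$ is a conditional knowledge base): we say that
$\mathfrak{N} \models KB_{\Rightarrow}$ iff

for every $\varphi[x] \Rightarrow \psi[x] \in KB_{\Rightarrow}$ it holds that $\mathfrak{N} \models \varphi[x] \Rightarrow \psi[x]$

3. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
we say that

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$

($KB_{\Rightarrow}$ cumulative-ordered-net-implies $\alpha[x] \Rightarrow \beta[x]$) iff

for every cumulative-ordered net agent $\mathfrak{N}$: if $\mathfrak{N} \models KB_{\Rightarrow}$, then $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.

We loosely refer to the notions defined in this section by the term
‘cumulative-ordered net semantics’.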

Example 163 

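Let $\mathcal{I}_1$ be as defined in chapter 14; let us only focus on $\mathcal{L}$ with respect
to the propositional language based on the propositional variables $b$ (“bird”), $f$
(“flyer”), $w$ (“wings”), $p$ (“penguin”).

Let $\mathfrak{I}_1^{c,o}(b) = \{bias, n_1\}$, $\mathfrak{I}_1^{c,o}(f) = \{bias, n_1, n_2\}$, $\mathfrak{I}_1^{c,o}(w) = \{bias, n_1, n_3\}$,
$\mathfrak{I}_1^{c,o}(p) = \{bias, n_1, n_4\}$, $\mathfrak{I}_1^{c,o}(\neg\varphi) := \{bias\} \cup N_1 \setminus \mathfrak{I}_1^{c,o}(\varphi)$,
$\mathfrak{I}_1^{c,o}(\varphi \wedge \psi) := \mathfrak{I}_1^{c,o}(\varphi) \cup \mathfrak{I}_1^{c,o}(\psi)$ for all $\varphi, \psi \in \mathcal{L}$.

Let $N_1^{p} = \{n_1^{p}, n_2^{p}, n_3^{p}, n_4^{p}\}$. Let $\mathfrak{I}_1^{p}(b) = \{n_1^{p}\}$, $\mathfrak{I}_1^{p}(f) = \{n_1^{p}, n_2^{p}\}$,
$\mathfrak{I}_1^{p}(w) = \{n_1^{p}, n_3^{p}\}$, $\mathfrak{I}_1^{p}(p) = \{n_1^{p}, n_4^{p}\}$, $\mathfrak{I}_1^{p}(\neg\varphi) := N_1^{p} \setminus \mathfrak{I}_1^{p}(\varphi)$,
$\mathfrak{I}_1^{p}(\varphi \wedge \psi) := \mathfrak{I}_1^{p}(\varphi) \cup \mathfrak{I}_1^{p}(\psi)$ for all $\varphi, \psi \in \mathcal{L}$.

It is easy to see that $\mathfrak{I}_1^{p}$ and $\mathfrak{I}_1^{c,o}$ satisfy the postulates above.

Let $g : N_1^{p} \rightarrow N_1 \setminus \{bias\}$, s.t. $g(n_i^{p}) = n_i$. $\mathcal{I}_1$ and $g$ determine a system
$Sys$, s.t. $\mathfrak{N}_1 = \langle Sys, \mathcal{I}_1, g, \mathfrak{I}_1^{p}, \mathfrak{I}_1^{c,o}\rangle$ is a cumulative-ordered net agent.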

The definitions of (total) belief ascription in net agents entail: 

{bias,ui} B^^^(b), {bias,n\} B^'^{b\/ p), 

{bias, ni, 77,2, 713} Noi, B^^^{b), {6ia5, 77,1, 77,2, 77,3} B^^^{f Aw), 

{bias, 77,1, 77-4} B'^^'^{b Ap Aw), {bias, t^i, 773, 77,4} A 77;), 

but also 

[bias}i^^^ jB^’^(b), {bias, ni} B^^^{f), 

{bias, 771 , 772 , n 3 }i^ 07 i B^^^{p), {bias, t 7 i, 774 } B^^'^(f); 

{nf} Noi, ABP{b), {n?} AB^(byp), 

{n\,nl,nl} AB^(f Aw), {n\,nl} AB^{p), 

1=^1 ABP{bAp), and 
ABP{b), {n?,nf,ng}i^,i, ABP{b), 

{ 77 ^ , 774 } ABP{bApAw). 

{bias, 77i} AB^'^{b), {bias, 771} AB^'^{b\/ p), 

{bias, 771,772, 773} AB^^'^if Aw), {bias, T7 i, 774} AB^'^{p), 

{bias, 77 i, 774} AB^^^{b Ap), and 

{bias}}i^^^ AB^'^{b), {bias, t 7 i, 772 , 773 } AB'^^'^ib), 

{bias, 77 i, 774} AB^'^{b Ap Aw). 

From the definition of $\models$ for networks and defeasible conditionals, it
follows that

$\mathfrak{N}_1 \models$

$b[x] \Rightarrow f[x] \wedge w[x]$, $b[x] \wedge f[x] \Rightarrow w[x]$, $b[x] \wedge p[x] \Rightarrow \neg f[x] \wedge w[x]$,
$b[x] \Rightarrow \neg p[x]$, $b[x] \vee p[x] \Rightarrow f[x]$, ...,

$\mathfrak{N}_1 \not\models$

$b[x] \Rightarrow p[x]$, $p[x] \Rightarrow f[x]$, $f[x] \Rightarrow p[x]$, $\top \Rightarrow b[x]$, $\top \Rightarrow f[x] \wedge w[x]$,
$p[x] \Rightarrow \neg p[x]$, $w[x] \Rightarrow p[x]$, $b[x] \wedge p[x] \Rightarrow f[x]$, ...

Thus, e.g., if all that $\mathfrak{N}_1$ initially perceptually believes is that [$b$ is
true], i.e. (after one state-transition to the central system which has initially
zero activity apart from the bias, the central system looks like this)

Fig. 4: $\{bias, n_1\} \models_{\mathfrak{N}_1} AB^{c,o}(b)$

then $\mathfrak{N}_1$ finally centrally believes that [$f \wedge w$ is true], i.e.:

Fig. 5: $\{bias, n_1, n_2, n_3\} \models_{\mathfrak{N}_1} B^{c,o}(f \wedge w)$

Moreover, if all that $\mathfrak{N}_1$ initially perceptually believes is that [$b \wedge p$
is true], i.e. (after one state-transition to the central system which has zero
activity apart from the bias, the central system looks like this)

Fig. 6: $\{bias, n_1, n_4\} \models_{\mathfrak{N}_1} AB^{c,o}(b \wedge p)$

then $\mathfrak{N}_1$ finally centrally believes that [$\neg f \wedge w$ is true], i.e.

Fig. 7: $\{bias, n_1, n_3, n_4\} \models_{\mathfrak{N}_1} B^{c,o}(\neg f \wedge w)$

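The trajectories corresponding to these two nonmonotonic inferences
are (by the def. of the state-transition mappings)

$\langle s(0), s(1), s(2)\rangle$

where

$s(0) = \langle \{n_1^{p}\}, \{bias\}, \langle E_1, I_1\rangle, s^{a}(0), nc(\langle E_1, I_1\rangle), na(\langle E_1, I_1\rangle)\rangle$,

$s(1) = \langle \{n_1^{p}\}, \{bias, n_1\}, \langle E_1, I_1\rangle, s^{a}(1), nc(\langle E_1, I_1\rangle), na(\langle E_1, I_1\rangle)\rangle$,

$s(2) = \langle \{n_1^{p}\}, \{bias, n_1, n_2, n_3\}, \langle E_1, I_1\rangle, s^{a}(2), nc(\langle E_1, I_1\rangle), na(\langle E_1, I_1\rangle)\rangle$

and $s(2)$ is stable under the state-transition mapping ($s^{a}(0) \in S^{a}$, arbitrary), and

$\langle s'(0), s'(1), s'(2)\rangle$

where

$s'(0) = \langle \{n_1^{p}, n_4^{p}\}, \{bias\}, \langle E_1, I_1\rangle, s'^{a}(0), nc(\langle E_1, I_1\rangle), na(\langle E_1, I_1\rangle)\rangle$,

$s'(1) = \langle \{n_1^{p}, n_4^{p}\}, \{bias, n_1, n_4\}, \langle E_1, I_1\rangle, s'^{a}(1), nc(\langle E_1, I_1\rangle), na(\langle E_1, I_1\rangle)\rangle$,

$s'(2) = \langle \{n_1^{p}, n_4^{p}\}, \{bias, n_1, n_4, n_3\}, \langle E_1, I_1\rangle, s'^{a}(2), nc(\langle E_1, I_1\rangle), na(\langle E_1, I_1\rangle)\rangle$

and $s'(2)$ is stable under the state-transition mapping ($s'^{a}(0) \in S^{a}$, arbitrary).

It follows from our definition of inference ascription that
$\langle s(0), s(1), s(2)\rangle \models b \Rightarrow_{inf} f \wedge w$, and
$\langle s'(0), s'(1), s'(2)\rangle \models b \wedge p \Rightarrow_{inf} \neg f \wedge w$.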

Now suppose we extended $E_1$ by $\langle bias, n_1\rangle$: then it would also follow that
$\mathfrak{N}_1 \models \{\top \Rightarrow b[x], \top \Rightarrow f[x] \wedge w[x], \ldots\}$, i.e. if all that $\mathfrak{N}_1$ perceptually believed
was that “there is something”, i.e. if it had no information at all, it would
automatically infer “there is a bird, which is able to fly and has got wings”,
which might be plausible if, e.g., $\mathfrak{N}_1$ were itself a bird living in a cage only
populated by other (normal) birds. Thus, on the cognitive level, the role of the
bias node is to enable plausible nonmonotonic inferences from zero beliefs.

From the definition of $\models$ for networks and universal conditionals, it
follows:

$\mathfrak{N}_1 \models$ $f[x] \rightarrow b[x]$, $p[x] \rightarrow b[x]$, ... (including all logically valid
formulas),

$\mathfrak{N}_1 \not\models$ $f[x] \rightarrow w[x]$, $w[x] \rightarrow f[x]$, $f[x] \rightarrow p[x]$, $p[x] \rightarrow f[x]$, $w[x] \rightarrow p[x]$, $p[x] \rightarrow$ ...
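
The bird/penguin example can be simulated with a small sketch. The interpretation of the atoms and the clauses for negation and conjunction are taken from the example; the wiring of the net, however, is NOT given in this chapter and is an assumption of this sketch, chosen so that it reproduces the two trajectories above: $n_1$ excites $n_2$ and $n_3$, and $n_4$ inhibits the excitatory connection from $n_1$ to $n_2$. A conditional counts as satisfied iff the pattern of the consequent is contained in the stable state reached from the pattern of the antecedent; all function names are illustrative.

```python
BIAS = "bias"
NODES = frozenset({BIAS, "n1", "n2", "n3", "n4"})

# ASSUMED wiring (hypothetical, chosen to match the trajectories in the example).
EXCITATORY = frozenset({("n1", "n2"), ("n1", "n3")})
INHIBITORY = {"n4": {("n1", "n2")}}  # firing node -> excitatory edges it blocks

ATOM = {
    "b": frozenset({BIAS, "n1"}),
    "f": frozenset({BIAS, "n1", "n2"}),
    "w": frozenset({BIAS, "n1", "n3"}),
    "p": frozenset({BIAS, "n1", "n4"}),
}
TOP = frozenset({BIAS})

def neg(pattern):
    """Pattern of a negation, as stipulated in the example: bias plus the complement."""
    return frozenset({BIAS}) | (NODES - pattern)

def conj(*patterns):
    """Pattern of a conjunction: the union of the component patterns."""
    return frozenset().union(*patterns)

def closure(pattern, excitatory=EXCITATORY):
    """Iterate the assumed dynamics, with the input pattern clamped, until stable."""
    state = frozenset(pattern)
    for _ in range(len(NODES) + 1):
        blocked = set().union(*(INHIBITORY.get(n, set()) for n in state))
        fired = {dst for (src, dst) in excitatory if src in state and (src, dst) not in blocked}
        new_state = frozenset(pattern) | fired
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("no stable state reached")

def satisfies(antecedent, consequent, excitatory=EXCITATORY):
    """Defeasible conditional: consequent pattern contained in the closure of the antecedent."""
    return consequent <= closure(antecedent, excitatory)

# Extending the excitatory connections by an edge from the bias node to n1.
EXTENDED = EXCITATORY | {(BIAS, "n1")}
```

For instance, `satisfies(ATOM["b"], conj(ATOM["f"], ATOM["w"]))` corresponds to the first inference and `satisfies(conj(ATOM["b"], ATOM["p"]), conj(neg(ATOM["f"]), ATOM["w"]))` to the second; with `EXTENDED` the net also draws the inference from the verum to "bird".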



There is a variety of further interesting connectives and notions which
may be defined in order to describe the cognitive properties of interpreted nets.
However, for the rest of the paper we will restrict ourselves to the study
of the defeasible conditional and of the notions of conditional theories and
entailment defined for the network semantics.

16.2 The Representation Theorem for CL

Let us recall the following notational conventions and results: 

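• there is an $s \in S$, s.t. $s \models_{\mathfrak{N}} B^{c}(\alpha[x] \Rightarrow \beta[x])$ iff for all $s \in S$:
$s \models_{\mathfrak{N}} B^{c}(\alpha[x] \Rightarrow \beta[x])$ iff $\mathfrak{N} \models B^{c}(\alpha[x] \Rightarrow \beta[x])$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$ (see
corollary 146); in this section we will concentrate on the last notation,
and thus we say that a conditional $\alpha[x] \Rightarrow \beta[x]$ is satisfied by, or is true
in, a cumulative-ordered net agent, when we actually ascribe a general
defeasible belief to the agent

• we know that $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$ iff $Cl(g(\mathfrak{I}^{p}(\alpha[a]))) \models_{\mathfrak{N}} B^{c,o}(\beta[a])$ iff
$\mathfrak{I}^{c,o}(\beta[a]) \subseteq Cl(g(\mathfrak{I}^{p}(\alpha[a])))$ iff $\mathfrak{I}^{c,o}(\beta[a]) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha[a]))$ (this is
corollary 145): we are now going to use the last claim in order to show that
$\mathfrak{N}$ does or does not satisfy some conditional, and in this way we can focus
just on the central subsystem of $\mathfrak{N}$ while disregarding the perceptual
subsystem

• there is an $s \in S$, s.t. $s \models_{\mathfrak{N}} B^{c}(\alpha[x] \rightarrow \beta[x])$ iff for all $s \in S$:
$s \models_{\mathfrak{N}} B^{c}(\alpha[x] \rightarrow \beta[x])$ iff $\mathfrak{N} \models B^{c}(\alpha[x] \rightarrow \beta[x])$ iff $\mathfrak{N} \models \alpha[x] \rightarrow \beta[x]$ iff
$\alpha[x] \rightarrow \beta[x] \in \mathcal{TH}_{\rightarrow}(\mathfrak{N})$ (iff $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \vdash \alpha[x] \rightarrow \beta[x]$ iff
$\alpha[a] \rightarrow \beta[a] \in \mathcal{TH}_{\mathfrak{I}^{c,o}}$) iff $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \alpha[a] \rightarrow \beta[a]$ (see corollary 151,
def.161, corollary 154, and our remarks in the last section); we will use the
last clause when we ascribe a universal belief. In particular, we are going to
make use of the mutual equivalence of (i) $\alpha[x] \rightarrow \beta[x] \in \mathcal{TH}_{\rightarrow}(\mathfrak{N})$,
(ii) $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \vdash \alpha[x] \rightarrow \beta[x]$, and (iii) $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \alpha[a] \rightarrow \beta[a]$ in the
subsequent proofs without stating this explicitly.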

16.2.1 The Soundness Lemma for CL 

In this section we will take a look at the correct rules of inference for the general 
defeasible beliefs of cumulative-ordered net agents. Put differently, such rules 
will be closure properties of the following prototypical form: if a conditional 
so and so is believed by a cumulative-ordered interpreted network agent, and 
furthermore a conditional so and so is believed by the same net agent, then 
also the conditional so and so is believed by this net agent. 

The closure properties satisfied by cumulative-ordered nets will be 
proved to be those of the system CL of cumulative reasoning with Loop (in- 
troduced by Kraus et al.[85]; see our section 9.3 of part III). For our proofs we 
make use of the results of section 14.3. 

In order to prove this soundness result we need another lemma first: 

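Lemma 164

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative-ordered interpreted inhibition
network agent, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:

if $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$ then $Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$.

Proof:

Since $\mathfrak{I}^{c,o}(\beta) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$ by assumption (recall remark 145) and
$\mathfrak{I}^{c,o}(\alpha) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$ by remark 131, we have $\mathfrak{I}^{c,o}(\alpha \wedge \beta) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$,
therefore $\mathfrak{I}^{c,o}(\alpha) \subseteq \mathfrak{I}^{c,o}(\alpha \wedge \beta) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$, and thus by lemma 132
$Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$. ■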

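Lemma 165 (CL-Soundness I)

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative-ordered interpreted inhibition
network agent:

then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is a consistent conditional CL-theory extending $\mathcal{TH}_{\to}(\mathfrak{N})$.

Proof:

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN and $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a
cumulative-ordered net agent.

Let $\langle N_0, \ldots, N_k \rangle$ be the canonical partition of $\mathcal{I}$:

1. Reflexivity: $\mathfrak{N} \models \alpha[x] \Rightarrow \alpha[x]$;

since $\mathfrak{I}^{c,o}(\alpha) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$ by remark 131.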

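2. Left Logical Equivalence: if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \alpha \leftrightarrow \beta$, $\mathfrak{N} \models \alpha[x] \Rightarrow \gamma[x]$, then
$\mathfrak{N} \models \beta[x] \Rightarrow \gamma[x]$;

since in this case $\mathfrak{I}^{c,o}(\alpha) = \mathfrak{I}^{c,o}(\beta)$ by corollary 154, thus $\mathfrak{I}^{c,o}(\gamma) \subseteq
Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\beta))$.

3. Right Weakening: if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \alpha \to \beta$, $\mathfrak{N} \models \gamma[x] \Rightarrow \alpha[x]$, then
$\mathfrak{N} \models \gamma[x] \Rightarrow \beta[x]$;

since by assumption it holds that $\mathfrak{I}^{c,o}(\alpha) \supseteq \mathfrak{I}^{c,o}(\beta)$ according to postulate
2 for $\mathfrak{I}^{c,o}$ and therefore $\mathfrak{I}^{c,o}(\beta) \subseteq \mathfrak{I}^{c,o}(\alpha) \subseteq Cl(\mathfrak{I}^{c,o}(\gamma))$.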

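4. Cautious Cut: if $\mathfrak{N} \models \alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]$, $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, then
$\mathfrak{N} \models \alpha[x] \Rightarrow \gamma[x]$;

because we know from lemma 164 that by assumption $Cl(\mathfrak{I}^{c,o}(\alpha)) =
Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$. But also by assumption it holds that $\mathfrak{I}^{c,o}(\gamma) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$
and we are done.

5. Cautious Monotonicity: if $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, $\mathfrak{N} \models \alpha[x] \Rightarrow \gamma[x]$, then
$\mathfrak{N} \models \alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]$;

for again by assumption and by lemma 164 we have that $Cl(\mathfrak{I}^{c,o}(\alpha)) =
Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$; but by assumption we also know that $\mathfrak{I}^{c,o}(\gamma) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$,
and we are done again.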
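6. Loop: Suppose that $\mathfrak{N} \models \alpha_i[x] \Rightarrow \alpha_{i+1}[x]$ for $i = 0, \ldots, j$ (again addition
is understood modulo $j + 1$); in this case we have

$\mathfrak{I}^{c,o}(\alpha_1) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha_0))$, $\mathfrak{I}^{c,o}(\alpha_2) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha_1))$, $\ldots$, $\mathfrak{I}^{c,o}(\alpha_j) \subseteq
Cl(\mathfrak{I}^{c,o}(\alpha_{j-1}))$, and $\mathfrak{I}^{c,o}(\alpha_0) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha_j))$. By lemma 133 it follows that
$Cl(\mathfrak{I}^{c,o}(\alpha_0)) = \ldots = Cl(\mathfrak{I}^{c,o}(\alpha_j))$ and thus by remark 131,
$\mathfrak{I}^{c,o}(\alpha_{r'}) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha_{r'})) = Cl(\mathfrak{I}^{c,o}(\alpha_r))$ for arbitrary $r, r' \in \{0, \ldots, j\}$.

$\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is consistent, since $Cl(\mathfrak{I}^{c,o}(\top)) = Cl(\{bias\}) \subsetneq Cl(N) =
Cl(\mathfrak{I}^{c,o}(\bot))$ by def. 153. ■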

Lemma 166

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN and let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be
a cumulative-ordered net agent:

1. if $I = \emptyset$, then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is closed under the rules of Monotonicity,
Transitivity and the “Easy Half of the Deduction Theorem”, i.e.

$$\frac{\alpha[x] \Rightarrow (\beta[x] \to \gamma[x])}{\alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]} \quad \text{(EHD)}$$

2. if $I \neq \emptyset$, then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is not necessarily closed under Monotonicity,
Transitivity and EHD.

Proof:

By remark 131, Monotonicity holds for $\Rightarrow$ if $I = \emptyset$. On the other hand,
if $I \neq \emptyset$ Monotonicity is obviously no longer necessarily the case. See Kraus et
al.[85], p.180 for the proof that Monotonicity is equivalent to Transitivity and
EHD in the presence of the axioms/rules of C = [CL without Loop] (or recall
chapter 10). ■

The next lemma says that the theories defined by inhibition nets are 
not sound relative to the well-known system P of preferential reasoning (see 
chapter 10), which is stronger than CL: 

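Lemma 167

$\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is not necessarily closed under

$$\frac{\alpha[x] \Rightarrow \gamma[x], \quad \beta[x] \Rightarrow \gamma[x]}{\alpha[x] \vee \beta[x] \Rightarrow \gamma[x]}$$

Proof:

As a counterexample consider $N = \{bias, n_1, n_2, n_3\}$ s.t. $E = \{(n_1, n_3),
(n_2, n_3)\}$, $I = \emptyset$; then $\{n_3\} \subseteq Cl(\{n_1\})$, $\{n_3\} \subseteq Cl(\{n_2\})$, but $\{n_3\} \not\subseteq
Cl(\{n_1\} \cap \{n_2\}) = Cl(\emptyset)$. Add corresponding interpretations. ■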
Lemma 168

$\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is not necessarily closed under Contraposition.

Proof:

Consider $N = \{bias, n_1, n_2\}$ s.t. $E = \{(n_1, n_2)\}$, $I = \emptyset$;
in this case, $\{bias, n_1, n_2\} \subseteq Cl(\{bias, n_1\})$, but $N \setminus \{n_1\} = \{bias, n_2\}
\not\subseteq Cl(N \setminus \{n_1, n_2\}) = Cl(\{bias\})$. Again add corresponding interpretations. ■

The trivial counterexamples to Or and Contraposition show that the 
invalidity of the latter rules has nothing to do with the presence of inhibition 
but that it is rather a consequence of the definition of the net semantics as 
such. It is also easy to see that the Rational Monotonicity rule is not generally 
satisfied by interpreted nets. 
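
The two counterexamples can be replayed mechanically. The following is only a minimal sketch under an assumed reading of the net dynamics (a node fires if it belongs to the input, or if some firing node excites it through an edge that no firing node inhibits); the function name `closure` and the string node labels are illustrative choices, not notation from the text.

```python
from typing import FrozenSet, Set, Tuple

Node = str
Edge = Tuple[Node, Node]          # excitatory connection (m, n)
Inhibition = Tuple[Node, Edge]    # inhibitory connection (m', (m, n))


def closure(nodes: Set[Node], E: Set[Edge], I: Set[Inhibition],
            s_input: Set[Node]) -> FrozenSet[Node]:
    """Iterate the assumed net dynamics from the input until a stable state is reached."""
    state = frozenset(s_input)
    for _ in range(len(nodes) + 1):
        fired = set(s_input)                      # input nodes keep firing
        for (m, n) in E:
            inhibited = any(mp in state for (mp, edge) in I if edge == (m, n))
            if m in state and not inhibited:
                fired.add(n)
        if frozenset(fired) == state:
            return state
        state = frozenset(fired)
    raise ValueError("no stable state reached; the net may not be hierarchical")


# Counterexample to Or (lemma 167): I is empty
N_or = {"bias", "n1", "n2", "n3"}
E_or = {("n1", "n3"), ("n2", "n3")}
or_premise_1 = {"n3"} <= closure(N_or, E_or, set(), {"n1"})
or_premise_2 = {"n3"} <= closure(N_or, E_or, set(), {"n2"})
or_conclusion = {"n3"} <= closure(N_or, E_or, set(), {"n1"} & {"n2"})

# Counterexample to Contraposition (lemma 168): I is empty
N_cp = {"bias", "n1", "n2"}
E_cp = {("n1", "n2")}
cp_premise = N_cp <= closure(N_cp, E_cp, set(), {"bias", "n1"})
cp_conclusion = (N_cp - {"n1"}) <= closure(N_cp, E_cp, set(), N_cp - {"n1", "n2"})

print(or_premise_1, or_premise_2, or_conclusion)  # True True False
print(cp_premise, cp_conclusion)                  # True False
```

In both cases the premises of the rule hold while the conclusion fails, and no inhibitory connection is involved, which matches the remark that the failure of Or and Contraposition stems from the net semantics as such.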

16.2.2 The Completeness Lemma for CL 

Now we prove a result (roughly) analogous to a version of the completeness 
theorem for classical logic which states that every consistent theory has a model. 
We say just as in part III that $W$ is the set of worlds satisfying a given theory
$\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$, if $W$ is the set of propositional variable settings $w$, s.t. for all
$\alpha[x] \to \beta[x] \in \mathcal{TH}_{\to}$: $w \models \alpha[a] \to \beta[a]$.

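Lemma 169 (CL-Completeness I)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be a theory, i.e. deductively closed:
for every consistent conditional CL-theory $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$,
there is a cumulative-ordered interpreted inhibition network agent
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$, s.t.

• $\mathcal{TH}_{\to}(\mathfrak{N}) \supseteq \mathcal{TH}_{\to}$, and

• $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$, i.e. for every $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:

$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.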



Proof: 

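By the completeness theorem 105 stated in chapter 11 (and proved
by Kraus et al.[85], pp. 188-189), for every $\mathcal{TH}_{\Rightarrow}$ as above there is a finite
cumulative-ordered model $\mathfrak{M}_{co}$ (based on the set $W$ of worlds satisfying
$\mathcal{TH}_{\to}$), s.t. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{M}_{co} \models \alpha[x] \Rightarrow \beta[x]$ iff all states
minimal with respect to $\prec$ which make $\alpha$ true, also make $\beta$ true. In the following
we use $\mathfrak{M}_{co}$ to construct the intended interpreted network agent $\mathfrak{N}$. We use ‘$\mathsf{s}$’
with or without index to range over states in the sense of cumulative-ordered
models, and ‘$s$’ as usual to range over parameter-settings.

Let $N = \{bias\} \cup S$. Let $E = \{(bias, \mathsf{s}) \mid \mathsf{s} \text{ is not minimal according to } \prec\}
\cup \{(\mathsf{s}, \mathsf{s}') \mid \mathsf{s} \prec \mathsf{s}'\}$. For every $\mathsf{s} \in S$ let $L_{\mathsf{s}} = \{\mathsf{s}' \in S \mid \mathsf{s}' \prec \mathsf{s}\}$; say,
$L_{\mathsf{s}} = \{\mathsf{s}_1, \ldots, \mathsf{s}_{r_{\mathsf{s}}}\}$.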
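Now we define

$I_{\mathsf{s}} = \{(bias, (\mathsf{s}_1, \mathsf{s})), (\mathsf{s}_1, (\mathsf{s}_2, \mathsf{s})), \ldots, (\mathsf{s}_{r_{\mathsf{s}}-1}, (\mathsf{s}_{r_{\mathsf{s}}}, \mathsf{s})), (\mathsf{s}_{r_{\mathsf{s}}}, (bias, \mathsf{s}))\}$.

If $\mathsf{s}$ is minimal in $\prec$, then let $I_{\mathsf{s}} = \emptyset$. Let $I = \bigcup_{\mathsf{s} \in S} I_{\mathsf{s}}$. Obviously, $I \subseteq N \times E$.
Since $\mathcal{I} = \langle N, E, I, bias \rangle$ does not contain any cycles, and since $\mathcal{I}$ is finite
(because $S$ is), $\mathcal{I}$ is an FHIN.

We define for $\varphi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup \{\mathsf{s} \mid \mathsf{s} \text{ does not make } \varphi \text{ true}\}$.

Obviously, $\mathfrak{I}^{c,o}$ is an interpretation mapping. Define $N^{p}$, $\mathfrak{I}^{p}$, and $g$ as expected,
s.t. $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^{p}(\varphi))$. $\mathcal{I}$, $g$, $\mathfrak{I}^{p}$, $\mathfrak{I}^{c,o}$ determine a system $Sys$, s.t.
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ is a cumulative-ordered interpreted inhibition network
agent.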

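Now we show that $\mathfrak{M}_{co} \models \alpha[x] \Rightarrow \beta[x]$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, which
immediately entails: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.

Let $\alpha \in \mathcal{L}$. We will prove by induction that $\mathsf{s} \in S$ does not fire in the net
state $Cl(\mathfrak{I}^{c,o}(\alpha))$ iff $\mathsf{s}$ is a minimal $\alpha$-state according to $\mathfrak{M}_{co}$. Let $\langle N_0, \ldots, N_k \rangle$
be the canonical partition of $\mathcal{I}$ as usual.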

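• Induction basis:

let $\mathsf{s} \in N_0$ ($\mathsf{s} \neq bias$ since $\mathsf{s} \in S$);

$Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}) = 0$ iff $\mathsf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ iff $\mathsf{s}$ is an $\alpha$-state. Moreover, every
state $\mathsf{s}$ in $N_0$ is minimal according to $\prec$ by def. of $E$ and $I$.

• Induction step:

assume that for every $\mathsf{s} \in N_0 \cup \ldots \cup N_{i-1}$: $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}) = 0$ iff $\mathsf{s}$ is a
minimal $\alpha$-state. Now consider an arbitrary $\mathsf{s} \in N_i$:

$Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}) = 0$ iff $\mathsf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ and $\neg\exists m \in N_j$ with $j < i$ s.t.
($Cl(\mathfrak{I}^{c,o}(\alpha))(m) = 1$, $m \, E \, \mathsf{s}$, $\neg\exists m' \in N_u$ with $u < i$ ($Cl(\mathfrak{I}^{c,o}(\alpha))(m') = 1$,
$m' \, I \, (m, \mathsf{s})$)). But this is the case if and only if $\mathsf{s}$ is a minimal $\alpha$-state,
for the following reasons:

first, $\mathsf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ iff $\mathsf{s}$ is an $\alpha$-state; second, by def. of $E$ and $I$,

$\neg\exists m \in N_j$ s.t. $j < i$ and $Cl(\mathfrak{I}^{c,o}(\alpha))(m) = 1$, $m \, E \, \mathsf{s}$, $\neg\exists m' \in N_u$ with $u < i$ s.t.
($Cl(\mathfrak{I}^{c,o}(\alpha))(m') = 1$, $m' \, I \, (m, \mathsf{s})$) iff

$\forall \mathsf{s}' \in L_{\mathsf{s}}$ it holds that $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}') = 1$ iff

(by induction hypothesis) $\forall \mathsf{s}' \in L_{\mathsf{s}}$ it holds that $\mathsf{s}'$ is no minimal $\alpha$-state.
But $\mathsf{s}$ is an $\alpha$-state and $\forall \mathsf{s}' \in L_{\mathsf{s}}$ ($\mathsf{s}'$ is no minimal $\alpha$-state) iff $\mathsf{s}$ is a
minimal $\alpha$-state (by the Smoothness Condition). Therefore, we have that
$Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}) = 0$ iff $\mathsf{s}$ is a minimal $\alpha$-state. See fig. 8 for a visualisation
of the induction step: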
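Fig. 8: Situation in the Induction Step

We know that $\mathfrak{M}_{co} \models \alpha[x] \Rightarrow \beta[x]$ iff all minimal $\alpha$-states are $\beta$-states.
But the latter is the case, if and only if for all $\mathsf{s} \in S$: if $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}) = 0$ then
$\mathsf{s} \notin \mathfrak{I}^{c,o}(\beta)$, or equivalently, for all $\mathsf{s} \in S$: if $\mathsf{s} \in \mathfrak{I}^{c,o}(\beta)$ then $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathsf{s}) = 1$.

So, $\mathfrak{M}_{co} \models \alpha[x] \Rightarrow \beta[x]$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$. ■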
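
The construction used in this proof can be tried out on a small example. The sketch below is an illustration under stated assumptions, not the author's implementation: the three states, their atoms, and the ordering are hypothetical; formulas are restricted to conjunctions of atoms; and the net dynamics are assumed to be "a node fires if it is in the input, or if a firing node excites it through an uninhibited edge". It builds $N$, $E$, and $I$ from a finite strict partial order and checks that the states which do not fire in the closure of the interpretation of $\alpha$ are exactly the minimal $\alpha$-states.

```python
from typing import Dict, List, Set, Tuple

BIAS = "bias"


def closure(E, I, s_input, bound):
    """Stable state reached from s_input under the assumed dynamics."""
    state = frozenset(s_input)
    for _ in range(bound + 1):
        fired = set(s_input)
        for (m, n) in E:
            inhibited = any(mp in state for (mp, edge) in I if edge == (m, n))
            if m in state and not inhibited:
                fired.add(n)
        if frozenset(fired) == state:
            return state
        state = frozenset(fired)
    raise ValueError("no stable state reached")


def build_net(states: Set[str], prec: Set[Tuple[str, str]]):
    """N, E, I from a finite strict partial order; (a, b) in prec means a precedes b."""
    N = {BIAS} | states
    E, I = set(), set()
    for s in states:
        lower: List[str] = sorted(a for (a, b) in prec if b == s)  # L_s
        if not lower:
            continue                         # s is minimal: I_s is empty
        E.add((BIAS, s))
        E.update((a, s) for a in lower)
        I.add((BIAS, (lower[0], s)))         # bias inhibits (s_1, s)
        for prev, nxt in zip(lower, lower[1:]):
            I.add((prev, (nxt, s)))          # s_k inhibits (s_{k+1}, s)
        I.add((lower[-1], (BIAS, s)))        # s_r inhibits (bias, s)
    return N, E, I


# Hypothetical model: each state is labelled with the atoms it makes true.
worlds: Dict[str, Set[str]] = {
    "s_fly": {"b", "f", "w"},
    "s_peng": {"b", "p", "w"},
    "s_odd": {"b", "p", "f", "w"},
}
states = set(worlds)
prec = {("s_fly", "s_peng"), ("s_fly", "s_odd"), ("s_peng", "s_odd")}
N, E, I = build_net(states, prec)


def alpha_states(atoms: Set[str]) -> Set[str]:
    return {s for s in states if atoms <= worlds[s]}


def non_firing(atoms: Set[str]) -> Set[str]:
    """States that stay silent in the closure of {bias} plus all non-alpha-states."""
    net_input = {BIAS} | (states - alpha_states(atoms))
    return states - closure(E, I, net_input, len(N))


def minimal(atoms: Set[str]) -> Set[str]:
    a = alpha_states(atoms)
    return {s for s in a if not any((t, s) in prec for t in a)}


for formula in ({"b"}, {"b", "p"}, {"p", "f"}):
    print(sorted(formula), non_firing(formula) == minimal(formula))
```

On this toy model the only minimal state for `b` is `s_fly`, which makes `f` true, while the only minimal state for `b` and `p` together is `s_peng`, which does not.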

If we take soundness and completeness together we get the following 
representation theorem: 

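Theorem 170 (CL-Representation)

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be a theory:

$\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ is a consistent conditional CL-theory extending $\mathcal{TH}_{\to}$
iff there is a cumulative-ordered interpreted inhibition network agent $\mathfrak{N} =
\langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$, s.t. $\mathcal{TH}_{\to}(\mathfrak{N}) \supseteq \mathcal{TH}_{\to}$, and $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$.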

Proof: 

Lemma 165 proves the direction from the right to the left, lemma 169 
proves the direction from the left to the right. ■ 

Apart from the versions of soundness/completeness given above there 
are several useful reformulations and consequences: 

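Corollary 171

Let $\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be a theory, $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$, $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. $\alpha[x] \Rightarrow \beta[x]$ is true in all cumulative-ordered interpreted inhibition network
agents $\mathfrak{N}$, s.t. $\mathcal{TH}_{\to}(\mathfrak{N}) \supseteq \mathcal{TH}_{\to}$, iff for all consistent conditional
CL-theories $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\to}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

2. $\alpha[x] \Rightarrow \beta[x]$ is cumulative-ordered-net-valid iff for all consistent conditional
CL-theories $\mathcal{TH}_{\Rightarrow}$ extending the universal ded. closure $dc(\{\top\})$ of
$\{\top\}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

3. (CL-Soundness/CL-Completeness II)

$\alpha[x] \Rightarrow \beta[x]$ is cumulative-ordered-net-valid iff $\alpha[x] \Rightarrow \beta[x]$ is CL-provable
(rel. to $dc(\{\top\})$)

4. $KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$ iff for all consistent conditional CL-theories
$\mathcal{TH}_{\Rightarrow} \supseteq KB_{\Rightarrow}$, s.t. $\mathcal{TH}_{\Rightarrow}$ extends $dc(\{\top\})$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

5. (CL-Soundness/CL-Completeness III)

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$ iff $KB_{\Rightarrow} \vdash_{CL} \alpha[x] \Rightarrow \beta[x]$.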

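Proof:

1. apply theorem 170;

2. apply theorem 170 to 1 in definition 162;

3. use 2 from above;

4. apply theorem 170 to 3 in definition 162;

5. ($\to$) assume that $KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$, and for reductio suppose
that there is no derivation of $\alpha[x] \Rightarrow \beta[x]$ from $KB_{\Rightarrow}$ in CL. Now let
$\mathcal{TH}_{\Rightarrow} = Ded_{CL}(KB_{\Rightarrow})$. $\mathcal{TH}_{\Rightarrow}$ is a conditional CL-theory, $\mathcal{TH}_{\Rightarrow} \supseteq
KB_{\Rightarrow}$ and $\mathcal{TH}_{\Rightarrow}$ is consistent by remark 84. By 170 there is a cumulative-ordered
net agent $\mathfrak{N}$, s.t. $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$. But then $\mathfrak{N} \models KB_{\Rightarrow}$ and
$\mathfrak{N} \not\models \alpha[x] \Rightarrow \beta[x]$, contradicting $KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$.

($\leftarrow$) assume that there is a derivation of $\alpha[x] \Rightarrow \beta[x]$ from $KB_{\Rightarrow}$ in CL;
if $KB_{\Rightarrow}$ is inconsistent then the left-hand side follows trivially. Otherwise
if for some arbitrary cumulative-ordered net agent $\mathfrak{N}$ we have $\mathfrak{N} \models KB_{\Rightarrow}$,
then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N}) \supseteq KB_{\Rightarrow}$, and $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is a consistent conditional CL-theory
by 170; thus $\mathcal{TH}_{\Rightarrow}(\mathfrak{N}) \supseteq Ded_{CL}(KB_{\Rightarrow})$. But then $\alpha[x] \Rightarrow
\beta[x] \in \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ and therefore $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$. ■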




Chapter 17 

CUMULATIVE-ORDERED INTERPRETED INHIBITION NET 
AGENTS AS IDEAL AGENTS 



We can use the results which we have obtained in the last chapter in order 
to show that there are ideal cumulative-ordered interpreted inhibition network 
agents, and these agents are definitely low-level agents. 

First of all, since we have fixed the topology of the inhibition nets 
which are used as the central subsystems of our interpreted inhibition network 
agents, every general belief and every inference of such an agent is basic.* This 
is easily shown with the help of the definitions of section 8.1 in chapter 8, and
corollaries 146 and 151 (note that we do not have to deal with the distinction 
between direct and indirect inferences, since, as we have emphasized before, this 
distinction collapses in the case of network agents). The next lemma expresses 
this property of our net agents: 

Lemma 172 (General Beliefs and Nonmon. Inferences are Basic for Our Net 
Agents) 

Let ^ = {Sys,X,g,3P,3^'^) be a cumulative-ordered interpreted inhibi- 
tion network agent; 

let a[x] f3[x] G a[x] /3[x] G 

let {s{t))^^j^ be an arbitrary trajectory of 31; 

let {s{t — Ate), • • • , s{t)) be the subsequence of {s{t))^^j^ from time t — 
Ate to t (for arbitrary t): 

1. A has at t rel to {s{t))^^j^ the basic universal belief that [a[x] (3[x\ is 

true] ifJ3l\=^ B^{a[x] /?[x]) 

2. A has att rel. to {s{t))^^j^ the basic general defeasible belief that [a[x] 
/3[x] is true] iff3l\=^ B^{a[x] ^ !3[x\) 

3. A draws from t — Ate to t rel. to {s{t))^^j^ the basic nonmonotonic in- 
ference from a[a] to P[a] iff 

(s(t- Ate),...,s(t)) N a[a] ^inf /3[a]. 



We have already pointed out something similar in example 36 in part II. 
Now let us reconsider the “actual” model $\mathfrak{M}_{act} = \langle \ldots \rangle$
from parts I-III, and let us assume that its normic component model is
a cumulative-ordered model; this is not a very demanding assumption, since
the intended models for normic conditionals are usually regarded as special
cases of cumulative-ordered models, including preferential or even ranked
models (compare our comment at the end of part III).

We recall def.50 of high first-order reliability in the normic sense from 
section 8.2: 

let $\alpha[a], \beta[a] \in \mathcal{L}$:

Definition 173 (High First-Order Reliability in the Normic Sense)

It is highly reliable in the normic sense to draw an inference from $\alpha[a]$
to $\beta[a]$ iff

$\mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]$.

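Let $\mathcal{TH}_{\Rightarrow}(\mathfrak{M}_{act}) = \{\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow} \mid \mathfrak{M}_{act} \models \alpha[x] \Rightarrow_{nor} \beta[x]\}$. Let
$\mathcal{TH}_{\to} \subseteq \mathcal{L}_{\to}$ be the set of universal $\alpha[x] \to \beta[x]$ that are satisfied by the set
$W$ of worlds on which $\mathfrak{M}_{act}$ is based, i.e., $\mathcal{TH}_{\to} = \{\alpha[x] \to \beta[x] \in \mathcal{L}_{\to} \mid$ for all
$w \in W$: $w \models \alpha[a] \to \beta[a]\}$.
$\mathcal{TH}_{\Rightarrow}(\mathfrak{M}_{act})$ is a conditional CL-theory extending $\mathcal{TH}_{\to}$ (for the terminology,
see part III). By lemma 169 of the last chapter, there is a cumulative-ordered
interpreted inhibition network agent $\mathfrak{N}_{ideal}$, s.t.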

• and 

• i.e. for every a[x] => /3[x] G 

a[x] P[x] e iff ^ideai ^ B^{a[x] ^ /3[x]) 

(now we write ^^ideai B^{a[x] P[x]y again instead of the equivalent 
'^ideal C^[x] ^ P[xy). 



We claim that $\mathfrak{N}_{ideal}$ is indeed ideal. Since we have restricted ourselves
to nonmonotonic inferences in network agents, since we do not have to
distinguish direct from indirect nonmonotonic inferences, and since every such
inference in a network agent is basic, we only have to check (recall def.54 in
section 8.4) whether every inference that is drawn by $\mathfrak{N}_{ideal}$ is in fact highly
reliable in the, say, normic sense. But since it is highly reliable in the normic 
sense to draw an inference from a[a] to p[a] iff dJlact ^ o:[x] =^nor P[x] iff 
a[x] /?[x] G iff ^ideai ^ B^{a[x] P[x]), we have by def.54, 

and def.58 of section 8.6: 

Theorem 174 (Existence of Ideal Low-Level Agents I) 

Let a[a], f3[a] G C; 

let {s{t))^^j^ be an arbitrary trajectory of^ideaii 
let {s{t — Ate), • • • 7 s{t)) be the subsequence of {s{t))^^j^ from time t — 
Ate to t (for arbitrary t): 
1. {t - Ate, (s(i))tg/„ 1= J{a[a] =^inf P[a\) iff

{s{t - Ate), ■■■, s{t)) t= a[a] ^inf /3[o] 

2. A is ideal with respect to {s{t))^^j^. 

Since {s{t))^^j^ is arbitrary, we may finally conclude by def.59 of sec- 
tion 8.6: 

Theorem 175 (Existence of Ideal Low-Level Agents II) 

$\mathfrak{N}_{ideal}$ is ideal.



Let us add that by lemma 165, the nonmonotonic inferences drawn 
by any net agent - whether ideal or not - obey the same closure properties 
that the set of justified nonmonotonic inferences obey according to corollary 
112 of chapter 12. This shows that the existence of $\mathfrak{N}_{ideal}$ is no “singularity”:
there are definitely many cumulative-ordered interpreted inhibition net agents 
which are not ideal, but which are good approximations to ideal agents, since 
the cognitive architecture of cumulative-ordered net agents is “tailor-made” for 
justified nonmonotonic inference implementation in general. 



Before we discuss this result in chapter 20 by confronting it with pos- 
sible objections to some of its underlying assumptions, we will first compare in 
chapter 18 the class of our net agents, and in particular their central inhibi- 
tion net systems, to some related symbolic implementations of nonmonotonic 
inferences. This is going to highlight the specific properties of the cognitive 
architecture of our ideal low-level agent $\mathfrak{N}_{ideal}$. In chapter 19 we will show that
we might just as well have used an artificial neural network (with weights, etc.)
as the central system of our ideal agent $\mathfrak{N}_{ideal}$. The only difference would have
been a slightly more complex architecture, and slightly more complications 
concerning the proofs of the results in chapter 16. 




Chapter 18 

INHIBITION NETS AND OTHER FORMS OF 
NONMONOTONIC REASONING 



There are various connections between inhibition nets and other mechanisms of 
nonmonotonic reasoning. We will again restrict ourselves to finite, hierarchical 
inhibition nets. 

Let us begin with logic programs. In chapter 23 one can find the basic 
definitions and standard results for the blossoming field of logic programming, 
as far as we need them in this section. By means of the following two definitions 
we will show how to associate finite hierarchical inhibition nets with logic pro- 
grams of a certain important type, and vice versa. First of all, given an FHIN 
we can construct a “counterpart program” in which excitatory connections are 
simulated by rules with positive bodies, and in which inhibitory connections are 
replaced by negation as failure. Input states will be transformed into bodyless
rules:

Definition 176

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN:

the program $\Pi(\mathcal{I})$ associated with $\mathcal{I}$ is defined in the following way:

1. use $N$ as the set $P$ of propositional variables (but if there is no edge from
$bias$ to some other node or edge, simply drop $bias$)

2. for each $n \in N$ add all rules of the form $n \leftarrow n', \text{not } n_1, \ldots, \text{not } n_j$, where

• $n' \, E \, n$,

• for all $i$ with $1 \leq i \leq j$: $n_i \, I \, (n', n)$,

• for all $n'' \in N$: if $n'' \, I \, (n', n)$ then $\exists i$ with $1 \leq i \leq j$ s.t. $n'' = n_i$

3. do not add any further rules.

Let $s^* \in S$; the program $\Pi(\mathcal{I}, s^*)$ associated with the net $\mathcal{I}$ and the
input $s^*$ is defined as follows:

1. use $N$ as the set $P$ of propositional variables (but, again, if there is no
edge from $bias$ to some other node or edge, drop $bias$)

2. add all rules contained in $\Pi(\mathcal{I})$

3. add all bodyless rules with head $n$ iff $s^*(n) = 1$

4. do not add any further rules.

The next definition shows how to simulate any given finite, normal, 
hierarchical logic program by means of an inhibition net in which conjunction 
nodes replace the positive parts of rule bodies, and in which inhibition lines replace
negation as failure. Recall from theorem 134 that we can always construct
a conjunction node $n_1 \wedge \ldots \wedge n_i$ of nodes $n_1, \ldots, n_i$ (if $i = 1$ then we regard
$n_1$ as a “conjunction” node); for the construction we have to add a subnet of
auxiliary inter-nodes. The input associated with a program will be set to the
class of heads of its bodyless rules:

Definition 177

Let $\Pi$ be a finite, normal, and hierarchical program (allowing for negation
as failure) based on a set $P$ of propositional variables:

the inhibition net $\mathcal{I}(\Pi) = \langle N_\Pi, E_\Pi, I_\Pi, bias_\Pi \rangle$ associated with $\Pi$ is
given as follows:

1. $N_\Pi = P \cup \{bias_\Pi\} \cup$ the set of auxiliary nodes needed for the construction
of conjunction nodes; $bias_\Pi$ is some object that is not contained in $P$

2. for all $n, n_1, \ldots, n_{i+j} \in N_\Pi$: $(n_1 \wedge \ldots \wedge n_i) \, E_\Pi \, n$ iff there is a rule
$n \leftarrow n_1, \ldots, n_i, \text{not } n_{i+1}, \ldots, \text{not } n_{i+j}$ in $\Pi$

3. for all $n, n', n_1, \ldots, n_i, n_{i+2}, \ldots, n_{i+j} \in N_\Pi$: $n' \, I_\Pi \, ((n_1 \wedge \ldots \wedge n_i), n)$ iff
there is a rule $n \leftarrow n_1, n_2, \ldots, n_i, \text{not } n', \text{not } n_{i+2}, \ldots, \text{not } n_{i+j}$ in $\Pi$.

The input $s^*(\Pi)$ associated with $\Pi$ is defined as $\{bias_\Pi\}$ joined with
the set of all propositional variables $n$, s.t. $n$ is contained in $\Pi$ as a bodyless
rule.

Note that $\mathcal{I}(\Pi)$ has more nodes than $\Pi$ has propositional variables.

The following theorem proves the two definitions above to be sound 
and compatible: 

Theorem 178

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN, $s^* \in S$, $\Pi$ be a finite, normal,
hierarchical program:

1. $\Pi(\mathcal{I})$ and $\Pi(\mathcal{I}, s^*)$ are finite, normal, hierarchical programs

2. $\mathcal{I}(\Pi)$ is an FHIN, $s^*(\Pi)$ is a state of $\mathcal{I}(\Pi)$

3. $Cn(\Pi(\mathcal{I}, s^*)) = Cl_{\mathcal{I}}(s^*)$

4. $Cl_{\mathcal{I}(\Pi)}(s^*(\Pi)) \setminus [\{bias_\Pi\} \cup$ the set of auxiliary nodes$] = Cn(\Pi)$

Proof:

1. Directly by def. 176; use lemma 121.

2. Directly by def. 177 and def. 215 of being hierarchical.

The other claims follow from def. 176 and 177, and def. 212 of answer
sets. By proposition 218 answer sets are identical to stable states in the case of
finite, normal, hierarchical programs. ■

Claims 3 and 4 capture what we mean when we say that finite, normal, 
hierarchical logic programs and FHINs are equivalent concerning their dynam- 
ics. An analogous theorem may be stated for finite normal hierarchical basic 
programs and inhibition nets without inhibitory connections. 
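
Claim 3 can be illustrated on a small net. The following sketch makes several assumptions that go beyond the text: the net dynamics are read as "a node fires if it is in the input, or if a firing node excites it through an uninhibited edge"; the answer-set test uses the standard reduct construction (drop every rule whose negated atoms intersect the candidate, strip the remaining negations, and compare the least model with the candidate); the bias node is omitted because no edge starts from it; and the node labels `b`, `p`, `f` are purely illustrative.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[str, FrozenSet[str], FrozenSet[str]]   # (head, positive body, negated atoms)


def closure(E, I, s_input, bound):
    """Stable state reached from s_input under the assumed net dynamics."""
    state = frozenset(s_input)
    for _ in range(bound + 1):
        fired = set(s_input)
        for (m, n) in E:
            inhibited = any(mp in state for (mp, edge) in I if edge == (m, n))
            if m in state and not inhibited:
                fired.add(n)
        if frozenset(fired) == state:
            return state
        state = frozenset(fired)
    raise ValueError("no stable state reached")


def net_to_program(E, I, s_input) -> List[Rule]:
    """One rule per excitatory edge, negating exactly its inhibitors, plus input facts."""
    rules: List[Rule] = []
    for (m, n) in E:
        inhibitors = frozenset(mp for (mp, edge) in I if edge == (m, n))
        rules.append((n, frozenset({m}), inhibitors))
    for n in s_input:
        rules.append((n, frozenset(), frozenset()))   # bodyless rule
    return rules


def least_model(definite_rules) -> Set[str]:
    model: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model


def is_answer_set(rules: List[Rule], candidate) -> bool:
    """Reduct test: remove rules blocked by the candidate, then compare least models."""
    reduct = [(h, pos) for (h, pos, neg) in rules if not (neg & candidate)]
    return least_model(reduct) == set(candidate)


# Hypothetical net: b excites f unless p fires.
nodes = {"b", "p", "f"}
E = {("b", "f")}
I = {("p", ("b", "f"))}

for s_input in ({"b"}, {"b", "p"}):
    stable = closure(E, I, s_input, len(nodes))
    program = net_to_program(E, I, s_input)
    print(sorted(s_input), sorted(stable), is_answer_set(program, stable))
```

For the input `{"b"}` the stable state is `{"b", "f"}`; for `{"b", "p"}` it is `{"b", "p"}`; in both cases the stable state passes the answer-set test for the translated program.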

By theorem 178 logic programs (e.g. implemented as Prolog programs) 
may be used to generate the closures of input states of FHINs. Furthermore, all 
of the notions and results on inhibition nets presented above may be applied 
to finite, normal, and hierarchical logic programs, and also vice versa. E.g., it 
is well known that answer sets for finite normal programs are minimal models 
of the translation of the programs into the language of classical logic, and that 
the classical interpretations which are closed under and supported by a logic 
program are precisely those interpretations which satisfy the completion of the 
program (see Lifschitz[95], pp. 106-107), and so forth. These results are now 
also applicable to inhibition nets. 

In analogy to (partially) interpreted inhibition nets we can also speak 
of a (partially) interpreted finite, normal, and hierarchical logic program, which 
may be defined in the manner of def. 153. By theorem 178 also interpreted logic 
programs and interpreted inhibition nets are intertranslatable, where we use 
the interpretation mapping of the given interpreted program or network as 
the interpretation mapping of the intended translation of the given interpreted 
program or network. But note that the transition from an interpreted logic 
program to an interpreted inhibition net in the way of def. 177 will actually 
result in a partially interpreted inhibition net, since the nodes which have to be 
added in order to get a corresponding inhibition net are actually not contained 
within the images of the interpretation mapping of the given interpreted logic 
program. 

Inhibition nets may also be considered as default theories (Reiter [131])
of a certain restricted kind. Here we can again apply a result for logic programs:
see the end of appendix B for a translation of logic programs into default
theories. Using definition 176 we can thus also translate inhibition nets
into default theories. By proposition 219 the deductive closures of answer sets
are identical to the extensions of the translated logic programs. By theorem
178 this also holds for the closures of the translated inhibition nets with an
input. The well known translations from default logic to autoepistemic logic
(e.g. Konolige[84]) are therefore also applicable to inhibition nets. Moreover,
Marek&Truszczyński[101], pp. 374-376, directly translate logic programs into
sets of formulas of autoepistemic logic, and this result may also be used to
translate inhibition net reasoning into autoepistemic reasoning.

Similar to the transformation of inhibition nets into logic programs
and vice versa, we can also mutually translate inhibition nets into truth maintenance
systems (Doyle[40]) presented in a more abstract setting as nonmonotonic
formal systems (Brewka[26], pp. 125-143) or nonmonotonic rule systems
(Marek&Truszczyński[101], pp. 376-380). The rules of such systems have the
prototypical form ‘IF $a_1, \ldots, a_m$ UNLESS $b_1, \ldots, b_n$ THEN $c$’ analogous to the
normal rules in logic programs.

Various other nonmonotonic reasoning mechanisms do not seem to
have much to do with hierarchical inhibition nets, although they are also based
on directed acyclic graphs: e.g. inheritance nets, which lack inhibition, while
inhibition nets lack negative paths.

After pointing out the similarities between inhibition nets and some 
kinds of nonmonotonic reasoning mechanisms we should now turn to the dif- 
ferences. A superficial difference is that we have partially employed abstract 
imitations of the concepts used in dynamical systems theory and connection- 
ism, where nonmonotonic reasoning usually presupposes the language and the 
concepts of classical AI. But the essential distinction between interpreted nets 
and the other approaches lies on the interpretative level, i.e. on the level where 
some kind of meaning is assigned to entities like nodes, or to the processes 
acting upon these entities. The main idea used in logic programming, default 
reasoning, and truth maintenance systems is (i) to assign meaning to the very
entities (nodes, propositional variables) which are used as the constituents of
the local rules governing the nonmonotonic inference process, and (ii) to assign
meaning to the local rules themselves in some way. E.g., we might implement
a logic program using the propositional variables $b$, $p$, $f$ and the single rule
$f \leftarrow b, \text{not } p$. The entities having representational function are the propositional
variables $b$, $p$, $f$ standing for birds, penguins, and flyers respectively, and
the local rule $f \leftarrow b, \text{not } p$ by which birds are believed to be flyers as long
as they are not believed to be penguins. The propositional variables which are
subject to this local rule are thus also interpreted. On the other hand this is not 
true for interpreted inhibition nets: while the entities which represent are the 
patterns of activity, the entities subject to local activation rules are the nodes 
in a network. There are also no “abnormality nodes” but abnormality is repre- 
sented in the network implicitly. Edges are not interpreted at all in the case of 
inhibition nets, and generally it will even be impossible to read any content into 
a single connection. Finally, note that in logic programs propositional variables 
are used as nodes, whereas in interpreted networks propositional variables are
interpreted as sets of nodes. 

Furthermore, all of the approaches cited in this section belong to the
family of consistency-/fixed-point-based systems of nonmonotonic reasoning as
opposed to the preference-based systems (for more on this vague but useful 
distinction see e.g. Brewka et al.[28]). The former family shares a reluctance to 
cumulativity, which the latter is inclined to. In some way interpreted inhibition 
nets seem to bridge the gap between the two rivalling families: on the local level, 
i.e. on the level of nodes and edges, they are quite similar to the consistency- 
based approaches to nonmonotonic reasoning as the translations from above 
show us. But on the global level of patterns and belief ascriptions they behave
like preference-based systems as proved by the representation theorem of the 
last section (other bridges between the two families have been drawn e.g. by 
Brewka[25]). 




Chapter 19 

INHIBITION NETS AND ARTIFICIAL NEURAL NETWORKS 



An artificial neural network (ANN) with a distinguished external input may
be considered as a tuple $\langle U, W, A, O, NET, ex \rangle$ having the following properties
(this is the definition stated by Nauck et al.[113], pp. 19-24, where also a general
overview of ANNs is to be found):

1. $U$ is a finite and non-empty set of units

2. $W : U \times U \to \mathbb{R}$ is the pattern of (weight-) connectivity, which assigns a
weight to each edge between units

3. $A$ is a function which maps each unit $u \in U$ to an activation mapping
$A_u : \mathbb{R}^3 \to \mathbb{R}$, s.t. the activation state $a_u(t + 1)$ of $u$ at time $t + 1$ is
dependent on the previous activation state $a_u(t)$ of $u$, the current net
input $net_u(t + 1)$ of $u$, and the (constant) external input $ex(u)$ fed into
$u$, i.e.

$a_u(t + 1) = A_u(a_u(t), net_u(t + 1), ex(u))$

4. $O$ is a function which maps each unit $u \in U$ to an output mapping $O_u :
\mathbb{R} \to \mathbb{R}$, s.t. the output state $o_u(t + 1)$ of $u$ at time $t + 1$ is solely dependent
on the activation state $a_u(t + 1)$ of $u$, i.e. $o_u(t + 1) = O_u(a_u(t + 1))$

5. $NET$ is a function that maps every unit $u \in U$ to a net input (or propagation)
mapping $NET_u : (\mathbb{R} \times \mathbb{R})^U \to \mathbb{R}$, s.t. the net input $net_u(t + 1)$ of $u$ at time
$t + 1$ depends on the weights of the edges which lead from units $u'$ to
$u$, and on the previous output states of the units $u'$, i.e. $net_u(t + 1) =
NET_u(\lambda u'.(W(u', u), o_{u'}(t)))$

6. $ex : U \to \mathbb{R}$ is the external input function, which we assume to be constant.

Often, (i) $W$ is defined the way that the set of edges having non-zero
weight corresponds to a directed acyclic graph, and (ii) $A_u$ only depends on
the net input and/or the external input, i.e. $A_u : \mathbb{R}^2 \to \mathbb{R}$, s.t. $a_u(t + 1) =
A_u(net_u(t + 1), ex(u))$. If an ANN satisfies (i) we call it ‘layered’, if it satisfies
(ii) we call it ‘input-driven’. Let us furthermore call an ANN ‘binary’, if for
every $u \in U$: $O_u$ is a mapping from $\mathbb{R}$ to $\{0, 1\}$, $ex(u) \in \{0, 1\}$, and if $ex(u) = 1$
then also $o_u(t + 1) = 1$ (for $t > 0$).
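
To make the three properties concrete, here is a minimal sketch of a binary, layered, input-driven ANN. Several choices are assumptions made only for illustration: the net input is taken to be the weighted sum of the previous outputs, the output mapping is a threshold at 1, the initial output state is set equal to the external input, and the four units with weights $-1$ and $1$ are a hypothetical example rather than a network from the text.

```python
from typing import Dict, List, Tuple

Weights = Dict[Tuple[str, str], float]   # (u', u) -> W(u', u); missing pairs have weight 0


def step(units: List[str], W: Weights, ex: Dict[str, int],
         out: Dict[str, int], threshold: float = 1.0) -> Dict[str, int]:
    """One synchronous update of all output states."""
    new_out = {}
    for u in units:
        # assumed NET_u: weighted sum over the previous output states
        net = sum(W.get((v, u), 0.0) * out[v] for v in units)
        # input-driven and binary: output is 1 if ex(u) = 1 or the net input reaches the threshold
        new_out[u] = 1 if ex[u] == 1 or net >= threshold else 0
    return new_out


def ann_closure(units: List[str], W: Weights, ex: Dict[str, int],
                threshold: float = 1.0) -> Dict[str, int]:
    """Iterate from the initial output state (here: ex itself) until nothing changes."""
    out = dict(ex)
    for _ in range(len(units) + 1):
        nxt = step(units, W, ex, out, threshold)
        if nxt == out:
            return out
        out = nxt
    raise ValueError("no stable output state reached; the ANN may not be layered")


# Hypothetical layered net: n2 excites n, n1 inhibits n, n excites n3.
units = ["n1", "n2", "n", "n3"]
W = {("n1", "n"): -1.0, ("n2", "n"): 1.0, ("n", "n3"): 1.0}

only_n2 = ann_closure(units, W, {"n1": 0, "n2": 1, "n": 0, "n3": 0})
n1_and_n2 = ann_closure(units, W, {"n1": 1, "n2": 1, "n": 0, "n3": 0})

print(only_n2)      # {'n1': 0, 'n2': 1, 'n': 1, 'n3': 1}
print(n1_and_n2)    # {'n1': 1, 'n2': 1, 'n': 0, 'n3': 0}
```

With external input on `n2` alone the activity propagates to `n3`; adding external input on `n1` cancels the net input of `n`, so `n3` stays silent.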

An ANN is a dynamical system in which an initial network state is 
transformed under the influence of the external input. We only consider ANNs 
defined on a discrete time scale. Moreover, we assume the initial state of an
ANN to be given by the output states $o_u(0)$ for arbitrary $u \in U$. More usually,
the initial net state is given by the activity states $a_u(0)$, but for our purposes
the output states are more adequate.

Now we will show correspondence results for a class of ANNs and the 
class of FHINs. We avoid stating the formal details of how to explicitly construct 
an FHIN from a given ANN, and vice versa, since the technicalities are rather 
awkward. Instead, we will only sketch the constructions: 

Theorem 179 

Every binary, layered, and input- driven ANN may be simulated by an 
FHIN, s.t. every output state of the ANN corresponds to a state s of the 
FHIN, every external input ex to the ANN is translated to an input state s* 
of the FHIN, and the closure of ex ( which may be defined analogously to the 
definition for FHINs) is translated to the closure of s* . 

Proof: 

All units of the given ANN are also used as nodes in the FHIN to be 
constructed. The only observation we need for the construction is that (i) if 
the external input to a unit in the ANN is 1, its output state is also set to
1, and (ii) if the external input to a unit in the ANN is 0, the output state
of the unit is the image of a Boolean mapping the arguments of which are the
output states of other units. This is the case because we have presupposed the
ANN to be binary. By theorem ISf we may thus replace the weighted edges 
leading to an arbitrary unit $u \in U$ by an inhibition subnetwork computing the
very Boolean mapping that is associated with u in the case of lacking external 
input. In order to do so we may have to add nodes, but this does not matter. 
Since the given ANN is supposed to be layered, the resulting inhibition net is 
hierarchical, and thus an FHIN. The external input function corresponds to an 
input state, the initial output state to an initial state in the FHIN. Therefore 
the given ANN and its associate FHIN have the same closure of the input. In 
general, the so-constructed FHIN has more nodes than the given ANN. ■

By theorem 179 every result on inhibition nets in sections 2 and 3 
is applicable to binary, layered, and input-driven ANNs. Now for the other 
direction: 

Theorem 180

Every FHIN may be simulated by a binary, layered, and input-driven
ANN, s.t. every state $s$ of the FHIN corresponds to an output state $o_u$ of the
ANN, every input state $s^*$ of the FHIN is translated to an external input $ex$ to
the ANN, and the closure of $s^*$ is translated to the closure of $ex$.

Proof: 

All nodes of the given FHIN are also used as units in the ANN to
be constructed. The inhibition of edges employed in inhibition nets may be
simulated by the inhibition of inter-nodes in ANNs: for each inhibitory line
$(n_1, (n_2, n_3))$ in the given FHIN add a further inter-node $n$ in the ANN,
replace $(n_1, (n_2, n_3))$ by the inhibitory line $(n_1, n)$ having the negative weight $-1$,
while simultaneously replacing the excitatory connection $(n_2, n_3)$ in the FHIN
by two excitatory lines $(n_2, n)$ and $(n, n_3)$, both having positive weight 1. All
other excitatory connections in the FHIN are replaced by connections with
positive weight 1. For each unit we choose weighted summation as the net input
mapping, the external input is identified with the (arbitrarily) given input state,
and we set the activation mapping of each unit to a threshold function, s.t. the
activation state of the unit is 1 if the external input to the unit is 1 or the net
input of the unit is larger than 0. For every unit we define the output mapping
to be the identity function. Since the states of nodes in an FHIN are binary, and
since FHINs are hierarchical, we get a binary, layered, and input-driven ANN
having the same closure of the input state as the given FHIN. The resulting
ANN generally has more nodes than the FHIN. ■
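
The inter-node construction in the proof can be illustrated by a minimal Python
sketch. It is not the formal construction itself: the dictionary encoding of
weights, the function names `step` and `closure`, the choice of the external
input as initial output state, and the bound `max_steps` are all assumptions
made for illustration. The sketch only covers a single inhibitory line
$(n_1, (n_2, n_3))$, replaced by the weights $-1$ on $(n_1, n)$ and $1$ on
$(n_2, n)$ and $(n, n_3)$.

```python
from typing import Dict, Tuple

# Hypothetical encoding: weights[(source, target)] is the weight of that edge.
Weights = Dict[Tuple[str, str], int]


def step(weights: Weights, ext: Dict[str, int], out: Dict[str, int]) -> Dict[str, int]:
    """One synchronous update of all units.

    A unit outputs 1 iff its external input is 1 or its weighted net input
    is larger than 0 (threshold activation, identity output mapping).
    """
    new_out = {}
    for unit in out:
        net = sum(w * out[src] for (src, tgt), w in weights.items() if tgt == unit)
        new_out[unit] = 1 if ext[unit] == 1 or net > 0 else 0
    return new_out


def closure(weights: Weights, ext: Dict[str, int], max_steps: int = 100) -> Dict[str, int]:
    """Iterate from the external input until the output state is stable.

    Assumption: the initial output state equals the external input.
    """
    out = dict(ext)
    for _ in range(max_steps):
        nxt = step(weights, ext, out)
        if nxt == out:
            return out
        out = nxt
    raise RuntimeError("no stable state reached within max_steps")


# Inhibitory line (n1, (n2, n3)) simulated via the inter-node n.
INHIBITORY_LINE: Weights = {("n1", "n"): -1, ("n2", "n"): 1, ("n", "n3"): 1}
```

With external input on `n2` alone, activity spreads via `n` to `n3`; if `n1`
is also externally activated, the net input of `n` is 0 and `n3` stays off.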

Theorems 179 and 180 state that FHINs and binary, layered,
input-driven ANNs are indistinguishable concerning their dynamics. In analogy to
(partially) interpreted inhibition nets we can speak of a (partially) interpreted
binary, layered, and input-driven ANN, which may be defined in the way of
def.153. By theorems 179 and 180, interpreted ANNs and interpreted
inhibition nets are also intertranslatable, by using the interpretation mapping of the
given network as the interpretation mapping of the translation of the given
network. But as the proof of theorem 179 shows, the translation of an
interpreted ANN into an interpreted inhibition net will generally result in a partially
interpreted inhibition net, since the set of nodes may have to be enlarged. The
same holds for the translation in the opposite direction (according to the proof
of theorem 180).

Every interpreted ANN with the properties from above again satisfies
a conditional theory, and for every consistent conditional theory there is an
interpreted ANN with such properties which satisfies the theory. For this
reason we may call CL sound and complete with respect to a net semantics for
artificial neural networks of an important kind. If the weights of the edges of an
interpreted ANN are altered by means of a learning mechanism, the
corresponding conditional theory is altered as well. In this way an interpreted ANN might
learn a conditional theory, i.e. a set of conditional beliefs might be acquired by
the network. The conditional beliefs of an interpreted ANN are represented by
the weights of its connections.

The results of chapters 18 and 19 also show that finite, normal, hier- 
archical logic programs and binary, layered, input-driven ANNs are intertrans- 
latable, as far as their closure dynamics is concerned. 




Chapter 20 
DISCUSSION 



We have shown in chapter 15 that we can ascribe singular beliefs, general
beliefs, and nonmonotonic inferences to cumulative-ordered net agents in such a way
that the definitions of part I, where we have explicated the notion of inference,
are satisfied. At the end of chapter 17 we have answered the questions that we
have raised at the beginning of part IV: (i) Is there a low-level agent which is
ideal in the sense of part II? The answer is: yes. (ii) What does the cognitive
architecture of such an ideal agent look like? The answer is: one possible low-level
implementation of an ideal agent is a cumulative-ordered interpreted inhibition
network agent. (iii) How may the typical properties of justified nonmonotonic
inferences be implemented, i.e., the nonmonotonicity effect, the “optimum
instability”, the closure properties, and the specificity sensitiveness? The answer is:
one possible implementation is by the combination of excitatory and inhibitory
connections in a network.

Our ideal agent is a simple connectionist-like network agent; she only
draws justified nonmonotonic inferences, and she does so in a feasible way, i.e.,
an inference takes the number of network layers times the unit time interval.
Our results have been derived by making use of the metalogical results that 
we have reported in part III, the results on the dynamics of finite hierarchical 
inhibition nets in chapter 14, and the representation theorem in chapter 16. 
Extensions of these results are to be found in chapters 24-27 of the appendix, 
where we show analogous representation theorems for the systems P, C, CM, 
M and corresponding classes of interpreted inhibition network agents. 

We are now going to focus on some possible arguments against some 
of the assumptions on the basis of which these results have been derived. The 
bullet-item claims express the objections: 

• We have described our cognitive net agents incompletely, since we have 
omitted (a) a specification of how the external inputs to their perceptual 
systems are transformed into perceptual belief patterns, (b) a specifica- 
tion of their action systems, and (c) also a specification of how the central 
systems have to be connected to the action systems in order to ensure an 
“appropriate” manifestation of the agents’ beliefs. Perhaps there are still 
no ideal low-level network agents despite the results from above, since the
necessary details concerning the perception and action systems cannot be 
supplemented in a satisfying way. 

Although this complaint about the lack of specification is perfectly jus- 
tified, there does not seem to be a good reason to believe such a specification 
to be impossible. If such a specification is possible for symbolic computation 
agents - which we take for granted - i.e., if it is possible to implement causal 
pathways from symbolic knowledge bases to effectors in such a way that the 
presence of a formula $\alpha$ in the knowledge base entails actions that are
appropriate for the environments in which $\alpha$ is true, then why should this not also
be possible given the presence of an activation pattern $\mathfrak{I}^{c,o}(\alpha)$ in a network?

• The definitions of interpreted inhibition network agents (def.139) and of
cumulative-ordered interpreted inhibition network agents (def.153) define
abstract interpreted systems that cannot be realized by concrete systems
in the real world, since the interpretation mappings $\mathfrak{I}^{p}$ and $\mathfrak{I}^{c,o}$ (the
“psycho”-semantics of the net agents) are unrealizable. E.g., in the case
of cumulative-ordered net agents, we have demanded that for all $\varphi, \psi \in \mathcal{L}$,
if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \to \psi$ then $\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$; but how should a concrete
network agent satisfy such a constraint with respect to the infinitely many
pairs of formulas $\varphi, \psi$ of $\mathcal{L}$? Perhaps this is where the combinatorial
open-endedness of a symbolic computation architecture is needed.

As we have pointed out at the end of section 15.2, we might regard $\mathfrak{I}^{p}$
and $\mathfrak{I}^{c,o}$ as partial mappings, s.t. not for every $\varphi \in \mathcal{L}$ there is an associated
pattern $\mathfrak{I}^{p}(\varphi)$ or $\mathfrak{I}^{c,o}(\varphi)$ of activation in our networks. E.g., it might be the case
that just, say, five formulas of $\mathcal{L}$ have such associated belief patterns; the
results from above would still be valid, but now just for the five “neurally”
interpreted formulas. The postulate on $\mathfrak{I}^{c,o}$ above would then read like this:
for all $\varphi, \psi \in \mathcal{L}$, if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \to \psi$ then $\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$, given that
$\mathfrak{I}^{c,o}(\varphi)$ and $\mathfrak{I}^{c,o}(\psi)$ are defined. Since $\mathcal{L}$ is assumed to be based on only finitely many
propositional variables, there are also only finitely many propositions (sets of
worlds) that are expressed by sentences of $\mathcal{L}$. It is actually these propositions
that have to be represented by the belief states of our network agents, since 
propositions are usually considered as the contents of beliefs according to the 
folk-psychological theory of beliefs, and not sentences. It is also these proposi- 
tions that correspond to the patterns of activation in the perceptual and central 
systems of the interpreted network agents: the larger a pattern is, the logically 
stronger is the proposition that it represents; conversely, the smaller a pattern 
is, the weaker is the proposition that it represents (compare the terminology of 
one formula being stronger or weaker than another with respect to the net se- 
mantics which we have introduced at the beginning of chapter 16). Such a form 
of representation complements the following property of belief states that we 
have expressed on p.25 of chapter 3: “beliefs have an internal “structure”, which
somehow mirrors the internal structure of the proposition that is the content 
of the belief, and/or that mirrors the syntactical structure of the sentence that 
expresses the content of the belief.” Of course, the second part is abandoned 
now, but the first one remains. This provokes the following counter-reply to the 
objection above: if propositions are the contents of belief states, why should an 
agent represent propositions indirectly via a detour through syntactical repre- 
sentations, instead of representing the propositions directly? E.g.: we usually 
assume that $\alpha \wedge \beta$ expresses the same proposition as $\beta \wedge \alpha$. Why should a
low-level agent use a “too fine-grained” format of representation by which one
is able to distinguish many (potentially infinitely many) representations which
express precisely the same propositions (just as $\alpha \wedge \beta$ and $\beta \wedge \alpha$ do), if the
propositions are the relevant entities and not the formulas that express them? 
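
The point about fine-grainedness can be made concrete with a small Python
sketch. It is only an illustration under stated assumptions: formulas are
modelled as Python predicates on valuations, worlds as truth-value assignments
to two hypothetical propositional variables `a` and `b`, and the helper name
`proposition` is ours. Syntactically distinct formulas then collapse into one
and the same set of worlds, and finitely many variables yield only finitely
many such sets.

```python
from itertools import product

# Assumption: a language based on just two propositional variables.
VARS = ("a", "b")
WORLDS = [dict(zip(VARS, vals)) for vals in product([False, True], repeat=len(VARS))]


def proposition(formula):
    """Return the set of worlds (as tuples of truth values) where formula holds."""
    return frozenset(tuple(w[v] for v in VARS) for w in WORLDS if formula(w))


# Two syntactically different formulas ...
a_and_b = lambda w: w["a"] and w["b"]
b_and_a = lambda w: w["b"] and w["a"]

# ... express the same proposition.
same = proposition(a_and_b) == proposition(b_and_a)

# With n variables there are 2**n worlds and hence 2**(2**n) propositions.
number_of_propositions = 2 ** len(WORLDS)
```

A representation format based on such sets has exactly one representation per
proposition, which is the kind of coarse-graining the counter-reply appeals to.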

• As emphasized in section 8.6, ideal agents are not by themselves also 
powerful; e.g., they may indeed draw only justified inferences, but this 
might be the case just because they do not draw many such inferences, 
and no practically relevant ones. Perhaps the net agent $A_{ideal}$ of chapter
17 is ideal but not powerful.

In fact, the agent $A_{ideal}$ is extremely powerful, because her
interpretation mappings have been assumed to be total functions, s.t. she even draws
every possible justified nonmonotonic inference from a given total perceptual
premise belief. If the interpretation mappings were not supposed to be total,
the power of $A_{ideal}$ would be directly proportional to the power of her
representational capacities, which may still be vast.

• The results indeed prove that $A_{ideal}$ is ideal and powerful, but that still
does not show that connectionist network agents may be ideal agents,
because interpreted inhibition net agents are no connectionist
architectures at all; they lack weights, continuous activation states, real-valued
propagation functions, etc. Perhaps $A_{ideal}$ is just a symbolic computation
agent “in disguise”.

Interpreted inhibition network agents are dynamical systems with a 
network structure; the kind of representations that we have used is distributed 
representation. Chapter 18 proves that inhibition nets are not so different from 
(not network-like) symbolic systems, as one would prima facie suppose; but
we have also seen that interpreted inhibition nets differ from the correspond- 
ing symbolic implementations concerning representation. Moreover, chapter 19 
shows that inhibition nets and typical artificial neural networks do not differ 
essentially with respect to their dynamics. We might just as well have used an 
artificial neural network (with weights etc.) as the implementation of a powerful 
ideal low-level agent. 

• In order to derive the results above we have presupposed that $A_{ideal}$ is
incapable of inductive reasoning, since her dispositional states, and thus 
her general beliefs, cannot be altered. But this restriction to primary 
justification is both unrealistic and trivializing, since inductive reasoning 
is an essential feature of natural (and maybe also artificial) agents, and 
a proper ideal agent would be one whose general beliefs are created by 
justified inductive reasoning processes, and whose inferences would thus 
also have secondary justification. Perhaps there is no ideal low-level agent 
- in particular, no ideal low-level network agent - with the capacity of 
justified inductive generalization. 

This issue has to be left open. As we have pointed out in subsection 
6.3.2 of part II, we do not have a satisfying definition of justified inductive rea- 
soning processes, and even if we had one, we do not know whether a low-level 
(network) agent would be able to reason justifiedly according to such a defi- 
nition. However, there are two good indications that low-level network agents 
might not be excluded by a theory of justification for inductive reasoning: (i) 
chapter 19 shows that there is an artificial neural network which is dynamically
equivalent to $A_{ideal}$. We might also use this artificial neural network as
our intended ideal agent. But artificial neural networks are probably the most 
promising implementations of learning architectures that we currently know of; 
learning processes constitute the first kind of component processes of inductive 
reasoning processes (see subsection 6.3.2). (ii) Monotonic and nonmonotonic 
reasoning processes from general beliefs to further general beliefs constitute 
the second type of component processes of inductive reasoning processes (see 
subsection 6.3.2 again). By the soundness lemma 165, monotonic and non- 
monotonic reasoning processes are superfluous in interpreted inhibition nets 
or artificial neural networks, since the general beliefs of the latter systems are 
automatically closed under the rules of CL just by the way the contents of the 
general beliefs are represented. 

These are just clues which indicate that the existence of ideal low-level network 
agents with the capacity of inductive reasoning is not impossible a priori. But
we do not have anything more substantial to say about that. 

Not surprisingly, a study like this has to end at the ultimate borderline of 
induction. 



Chapter 21
DIGRESSION ON STATES, DISPOSITIONS, CAUSATION, PROCESSES



In this chapter we will sketch how the theory of belief and inference developed 
in chapters 3 and 4 might be supplemented by stating formal definitions for 
the basic terms that have been left undefined, or that have been defined only 
on an informal level: among those are ‘(mental) states’, ‘occurrent states’ and 
‘dispositional states’, ‘being disposed to change to’, ‘being disposed to remain 
in’, ‘belief states’, and finally ‘(mental) processes’. This chapter is thus a kind of 
ontological addendum. Before we can give the definitions, we have to reconsider 
some of our assumptions on the cognitive agent A, and we have to add some
new ones. 

21.1 The System Assumptions 

We start with the following one: 

• (System Assumption on A) 

A is a system of parameters evolving in discrete time steps t = 0, 1, 2, . . . 

This has been assumed just for simplicity. All of what follows might 
also be developed for systems evolving in continuous time, e.g., for systems 
defined by differential equations. But discrete time simplifies various aspects of 
the theory. 

Furthermore, we assume: 

• (Assumption on A’s Subsystems) 

A consists of three subsystems: 

1. the perceptual system 

2. the central system 

3. the action system. 

Each subsystem is again a system of parameters; at each time the
parameters of the subsystems are set in some way. An assignment of values 
to the parameters of a subsystem is also called a ‘parameter-setting’ (of the 
relevant subsystem). The parameter-setting of the whole system is uniquely 
specified by the parameter-settings of the subsystems taken together. 

The perceptual system produces perceptual beliefs as its output, the 
central system produces, among others, central state beliefs and inferences, and 
the action system produces actions. 

As a next step, we have assumed that A’s central system consists of 
two further subsystems which differ concerning their long-time behaviour: 

• (Assumption on A’s Central System) 

A’s central system consists of two subsystems: 

1. the occurrent system of the central state parameters, which normally
change quickly and abruptly; call this the ‘occurrent central system’

2. the dispositional system of the central state parameters, which nor- 
mally change slowly and gradually; call this the ‘dispositional central 
system’. 

If we are given an arbitrary system of parameters, it may be the case 
that there is no obvious way of partitioning its set of parameters into those 
belonging to an occurrent subsystem and those belonging to a dispositional 
subsystem. Our assumption consists in the supposition that such a partition is 
possible in the case of A’s central system, and that such a partition has been 
set up. More generally, this might also be put as follows: we assume A to be a
system which is describable in the way set out in this chapter.

We associate with every parameter-setting $s^{c,d}$ of the dispositional
central system the way (i) in which a parameter-setting of the perceptual system
together with a parameter-setting of the occurrent central system leads to a new
parameter-setting of the central system under the condition that $s^{c,d}$ is the
current parameter-setting of the dispositional central system, and the way (ii) in
which a parameter-setting of the perceptual system together with a
parameter-setting of the occurrent central system leads to a new parameter-setting of
the action system, again given that $s^{c,d}$ is the current parameter-setting of the
dispositional central system. Put formally, this amounts to:

Definition 181 (Systems)

By a system $Sys$ we mean a tuple $(S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na)$, s.t.

1. $S^p \neq \emptyset$ is the set of possible parameter-settings $s^p \in S^p$ of the perceptual
system of A

2. $S^{c,o} \neq \emptyset$ is the set of possible parameter-settings $s^{c,o} \in S^{c,o}$ of the
occurrent central system of A

3. $S^{c,d} \neq \emptyset$ is the set of possible parameter-settings $s^{c,d} \in S^{c,d}$ of the
dispositional central system of A

4. $S^c = S^{c,o} \times S^{c,d}$ is the set of possible parameter-settings
$s^c = (s^{c,o}, s^{c,d}) \in S^c$ of the central system of A

5. $S^a \neq \emptyset$ is the set of possible parameter-settings $s^a \in S^a$ of the action
system of A

6. $nc : S^{c,d} \to \{f \mid f : S^p \times S^{c,o} \to S^c\}$, s.t.
$nc(s^{c,d})$ is a mapping from $S^p \times S^{c,o}$ to $S^c$, and $nc(s^{c,d})(s^p, s^{c,o})$ is the
next (i.e., at time $t+1$) parameter-setting of the central system of A if
$s^p$ has been the previous parameter-setting of the perceptual system of A
and $s^{c,o}$ has been the previous parameter-setting of the occurrent central
system of A (‘previous’ here means: at time $t$); $nc(s^{c,d})$ is associated with
$s^{c,d}$

7. $na : S^{c,d} \to \{f \mid f : S^p \times S^{c,o} \to S^a\}$, s.t.
$na(s^{c,d})$ is a mapping from $S^p \times S^{c,o}$ to $S^a$, and $na(s^{c,d})(s^p, s^{c,o})$ is the
next (i.e., at time $t+1$) parameter-setting of the action system of A if
$s^p$ has been the previous parameter-setting of the perceptual system of A
and $s^{c,o}$ has been the previous parameter-setting of the occurrent central
system of A (‘previous’ means again: at time $t$); $na(s^{c,d})$ is associated
with $s^{c,d}$.

E.g. in a connectionist network, the spreading $nc(s^{c,d})$ of activity under
a certain input $s^p$ from the perceptual subsystem and under a certain activity
setting $s^{c,o}$ of the nodes is determined by the weights of the edges and by
the topology of the network, i.e.: by the central dispositional parameter-setting
$s^{c,d}$.

The set $S$ of all parameter-settings of a system $Sys$ is the set of all
sextuples $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}))$, s.t. $s^p \in S^p$, $s^{c,o} \in S^{c,o}$,
$s^{c,d} \in S^{c,d}$, $s^a \in S^a$. $S$ is the ‘parameter-setting space’ of $Sys$.

We have: 

• (Assumption on A’s Being a System) 

A is a system as defined above. 

We are going to use the phrases ‘the system A’ and ‘the system of A’ 
interchangeably. 

Furthermore we assume that A’s perceptual system receives external
inputs, and that every such input to the perceptual system determines precisely
one parameter-setting of A’s perceptual system. This is the external causal
dynamics of the system A. The internal causal dynamics of the system A is
the evolution of central state parameter-settings determined by the current
parameter-setting of the perceptual system and the current parameter-setting
of the central system. The external and the internal dynamics together may be
represented by a mapping (for given $s^p_{new}$):

Let $F_{s^p_{new}} : S \to S$, s.t. for all parameter-settings
$s = (s^p_{old}, s^{c,o}_{old}, s^{c,d}_{old}, s^a_{old}, nc(s^{c,d}_{old}), na(s^{c,d}_{old})) \in S$ at time $t$:

$F_{s^p_{new}}(s) := (s^p_{new}, s^{c,o}_{new}, s^{c,d}_{new}, s^a_{new}, nc(s^{c,d}_{new}), na(s^{c,d}_{new}))$, where

1. $s^p_{new}$ is the parameter-setting of the perceptual system at $t+1$ ($s^p_{new}$ is
determined “from outside”)

2. $(s^{c,o}_{new}, s^{c,d}_{new}) = nc(s^{c,d}_{old})(s^p_{old}, s^{c,o}_{old})$ is the parameter-setting of the central
system at $t+1$

3. $s^a_{new} = na(s^{c,d}_{old})(s^p_{old}, s^{c,o}_{old})$ is the parameter-setting of the action system
at $t+1$

4. $nc(s^{c,d}_{new})$ is the mapping associated with $s^{c,d}_{new}$ which defines the next
central state-parameter-setting under a new parameter-setting of the
perceptual system and the new parameter-setting of the occurrent central
system

5. $na(s^{c,d}_{new})$ is the mapping associated with $s^{c,d}_{new}$ which defines the next
action-parameter-setting under a new parameter-setting of the perceptual
system and the new parameter-setting of the occurrent central system.

$F_{s^p_{new}}$ is the state-transition function of the system $(S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na)$
of A, which is only given relative to the next parameter-setting $s^p_{new}$ of
the perceptual system ($s^p_{new}$ is the only parameter which is not determined by
the previous parameter-setting $s$ of A, since $s^p_{new}$ is determined by the causal
influence of the environment on A).



Let us now assume that the parameter-setting of A’s perceptual system
remains constant (say, identical to a fixed $s^p$) for some amount of time. Then the
dynamics of A is defined as follows: for each
$s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$
the iterated application of $F_{s^p}$ defines the following trajectory of
parameter-settings:

$t = 0$: $s$
$t = 1$: $F_{s^p}(s)$
$t = 2$: $F^2_{s^p}(s) = F_{s^p}(F_{s^p}(s))$
$t = 3$: $F^3_{s^p}(s) = F_{s^p}(F_{s^p}(F_{s^p}(s)))$

$F^k_{s^p}(s)$ is the parameter-setting of A at time $k$ given that $s$ has been the initial
parameter-setting at time 0, and given a parameter-setting $s^p$ of A’s perceptual
system which is considered to be constant for a sufficient amount of time.
$(S, F_{s^p})$ is a so-called discrete dynamical system. $((S, F_{s^p}))_{s^p \in S^p}$ is a family
of discrete dynamical systems. Dynamical systems such as $(S, F_{s^p})$ are called
‘discrete’ since they evolve in discrete temporal steps of unit duration.

If S is finite, then a system as defined above is essentially a so-called 
finite automaton or finite state machine (see Arbib[7], p.8). 
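
For a finite parameter-setting space, the state-transition function and its
iteration can be sketched in a few lines of Python. This is a toy illustration,
not part of the theory: the integer-valued parameter-settings, the particular
mappings `nc` and `na`, and the helper names `make_transition` and `trajectory`
are assumptions chosen for brevity. A parameter-setting is represented as the
quadruple (perceptual, occurrent central, dispositional central, action); the
two mappings associated with the dispositional component are left implicit,
since they are determined by it.

```python
from typing import Callable, List, Tuple

# (s_p, s_co, s_cd, s_a); nc(s_cd) and na(s_cd) are determined by s_cd.
Setting = Tuple[int, int, int, int]


def nc(s_cd: int) -> Callable[[int, int], Tuple[int, int]]:
    """Toy central dynamics: the occurrent unit switches on (and stays on)
    if the percept is 1 and the dispositional 'weight' s_cd is 1."""
    def f(s_p: int, s_co: int) -> Tuple[int, int]:
        return (max(s_co, s_p * s_cd), s_cd)  # disposition is left unchanged
    return f


def na(s_cd: int) -> Callable[[int, int], int]:
    """Toy action dynamics: act iff the occurrent unit was on at time t."""
    def f(s_p: int, s_co: int) -> int:
        return s_co
    return f


def make_transition(s_p_new: int) -> Callable[[Setting], Setting]:
    """State-transition function relative to the next perceptual setting."""
    def F(s: Setting) -> Setting:
        s_p, s_co, s_cd, _ = s
        new_co, new_cd = nc(s_cd)(s_p, s_co)
        new_a = na(s_cd)(s_p, s_co)
        return (s_p_new, new_co, new_cd, new_a)
    return F


def trajectory(F: Callable[[Setting], Setting], s: Setting, k: int) -> List[Setting]:
    """Settings at times 0, 1, ..., k under constant perceptual input."""
    states = [s]
    for _ in range(k):
        states.append(F(states[-1]))
    return states
```

For instance, `trajectory(make_transition(1), (1, 0, 1, 0), 3)` traces how the
constant percept first activates the occurrent central unit and, one step
later, the action unit, after which the setting no longer changes.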

The parameter-settings of A’s action system evolve in parallel to the 
evolution of the parameter-settings of A’s central system. Each parameter- 
setting of A’s action system may be thought of as causing some action, or as 
leading to inactivity. 

In the following sections, let $(S^p, S^{c,o}, S^{c,d}, S^c, S^a, nc, na)$ be the
system of A.

21.2 States 

Using these assumptions we can now define various notions of states, where
a state is the extension of some property of parameter-settings of the whole
system, or of some subsystem:

Definition 182 (A’s States)

1. A state $X$ (of A) is a set of parameter-settings (for A), i.e. $X \subseteq S$; the
powerset $\wp(S)$ of $S$ is thus the set of states

2. let $s \in S$:

A is in $s$ in the state $X$ iff $s \in X$

3. let $s(t) \in S$ be the parameter-setting of A at time $t$:

A is at $t$ in the state $X$ iff $s(t) \in X$

4. a perceptual state $X^p$ (of A) is a set of parameter-settings of the perceptual
subsystem (of A), i.e. $X^p \subseteq S^p$; $\wp(S^p)$ is the set of perceptual states

5. let $s^p \in S^p$:

A is in $s^p$ in the perceptual state $X^p$ iff $s^p \in X^p$

6. let $s^p(t) \in S^p$ be the perceptual parameter-setting of A at time $t$:

A is at $t$ in the perceptual state $X^p$ iff $s^p(t) \in X^p$

7. accordingly for central states $X^c \subseteq S^c$, and action states $X^a \subseteq S^a$.

A perceptual state is thus a state of the perceptual subsystem, a central
state a state of the central subsystem, and an action state a state of the action
subsystem. We will use ‘X’, ‘Y’ (with or without indices) as variables for states.

As indicated in section 3.3 for perceptual beliefs, the perceptual states
of A are occurrent states. Action states could also be regarded as occurrent
states, but this is not important for the subsequent considerations. In the case
of central states the situation is more complex:

Definition 183 (A’s Occurrent Central and Dispositional Central States)

1. An occurrent central state $X^{c,o}$ (of A) is a set of parameter-settings of
the occurrent central system (of A), i.e. $X^{c,o} \subseteq S^{c,o}$; $\wp(S^{c,o})$ is thus the
set of occurrent central states

2. a dispositional central state $X^{c,d}$ (of A) is a set of parameter-settings of
the dispositional central system (of A), i.e. $X^{c,d} \subseteq S^{c,d}$; $\wp(S^{c,d})$ is thus
the set of dispositional central states.

The notions of ‘being in a state in a parameter-setting’ and ‘being in a
state at a time’ are defined for occurrent central states and for
dispositional central states just as in def.182.

Note that we can associate with every dispositional central state $X^{c,d}$
the image sets $nc(X^{c,d})$ and $na(X^{c,d})$ of mappings; indeed, we could have even
defined dispositional central states as sets of functions and not of
parameter-settings. We have not done so just in order to preserve the analogy to occurrent
central states, and because we have associated the mappings $nc(s^{c,d})$, $na(s^{c,d})$
with the parameter-settings $s^{c,d}$ anyway.

We can also define the notions of occurrent and of dispositional state: 

Definition 184 (A’s Occurrent and Dispositional States)

Let $S^o = \{(s^p, s^{c,o}) \mid \exists s \in S$, s.t. $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}))\}$
($S^o$ is the set of parameter-settings of the occurrent subsystem of A that consists
of A’s perceptual subsystem and A’s occurrent central subsystem).

Let $S^d = S^{c,d}$ ($S^d$ is the set of parameter-settings of the dispositional
subsystem of A that is identical to A’s dispositional central subsystem):

1. an occurrent state $X^o$ (of A) is a subset of $S^o$; $\wp(S^o)$ is the set of occurrent
states

2. a dispositional state $X^d$ (of A) is a subset of $S^d$, i.e., a dispositional
central state (of A); $\wp(S^d)$ is the set of dispositional states

Thus, occurrent states are states of a subsystem of A that consists of
both the perceptual system and the occurrent central system of A.

If we consider a central state $X^c \subseteq S^c$, we can associate with $X^c$ a
certain occurrent and a certain dispositional central state:

• (The Occurrent and the Dispositional Parts of a Central State)

Let

$Occ(X^c) = \{s^{c,o} \mid$ there is an $s^{c,d} \in S^{c,d}$, s.t. $(s^{c,o}, s^{c,d}) \in X^c\}$, and

$Dis(X^c) = \{s^{c,d} \mid$ there is an $s^{c,o} \in S^{c,o}$, s.t. $(s^{c,o}, s^{c,d}) \in X^c\}$.

$Occ(X^c)$ might be called the ‘occurrent part’ of the central state $X^c$,
$Dis(X^c)$ the ‘dispositional part’ of the central state $X^c$.

Formally, $Occ(X^c)$ is the left projection of $X^c$, while $Dis(X^c)$ is the
right projection of $X^c$.
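
The two projections are easy to state in code. The following Python sketch is
only illustrative; it assumes that a central state is given as a finite set of
pairs (occurrent setting, dispositional setting), and the string labels used
in the example are hypothetical.

```python
def occ(X_c):
    """Occurrent part: left projection of a central state."""
    return {s_co for (s_co, s_cd) in X_c}


def dis(X_c):
    """Dispositional part: right projection of a central state."""
    return {s_cd for (s_co, s_cd) in X_c}


# Hypothetical central state with three parameter-settings.
X_c = {("active", "w1"), ("idle", "w1"), ("active", "w2")}

occurrent_part = occ(X_c)
dispositional_part = dis(X_c)

# The product of the two parts may properly include the central state.
product_of_parts = {(o, d) for o in occurrent_part for d in dispositional_part}
```

Note that a central state is in general not recoverable from its two parts:
in the example, the pair `("idle", "w2")` lies in the product of the parts
but not in `X_c`.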

Perceptual states, central states, occurrent central states, dispositional
central states, action states, occurrent states, and dispositional states are states
of subsystems of A. But we may also use the following derived notions by which
the latter states are put into correspondence with states of the whole system:

Definition 185 (Counterparts)

Let $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$:

A is in $s$ in the perceptual state $X^p$ iff $s^p \in X^p$.

The perceptual state $X^p$ thus corresponds to the state
$Cp(X^p) = \{s \mid$ A is in $s$ in the perceptual state $X^p\}$ of the whole system: call
$Cp(X^p)$ the ‘counterpart state’ of $X^p$ on the level of the system A (and
analogously for central states, occurrent central states, dispositional central states,
action states, occurrent states, dispositional states; e.g., A is in $s$ in the central
state $X^c$ iff $(s^{c,o}, s^{c,d}) \in X^c$, and $Cp(X^c) = \{s \mid$ A is in $s$ in the central state
$X^c\}$).

One state may be defined to be a substate of another, if, whenever the
system is in the latter state, it is also in the former state; i.e.:

Definition 186 (Substates)

A state $X$ (of A) is a substate of a state $Y$ (of A) iff $Y \subseteq X$.
Equivalently, we may call $Y$ a ‘superstate’ of $X$

(analogously for perceptual substates, central substates, occurrent
central substates, dispositional central substates, action substates, occurrent
substates, dispositional substates).
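
Counterparts and substates can be illustrated with finite sets in Python. The
sketch rests on assumptions of our own: every subsystem has just the two
parameter-settings 0 and 1, a parameter-setting of the whole system is the
quadruple (perceptual, occurrent central, dispositional central, action) with
the associated mappings left implicit, and the function names are hypothetical.

```python
from itertools import product

# Toy parameter-setting space: each subsystem has the settings 0 and 1.
S = set(product((0, 1), repeat=4))  # (s_p, s_co, s_cd, s_a)


def counterpart_perceptual(X_p):
    """Counterpart state of a perceptual state on the level of the system."""
    return {s for s in S if s[0] in X_p}


def counterpart_central(X_c):
    """Counterpart state of a central state; X_c is a set of (s_co, s_cd) pairs."""
    return {s for s in S if (s[1], s[2]) in X_c}


def is_substate(X, Y):
    """X is a substate of Y iff Y is a subset of X."""
    return Y <= X


perceiving = counterpart_perceptual({1})        # percept is 1
any_percept = counterpart_perceptual({0, 1})    # no restriction on the percept
```

Whenever the system is in `perceiving` it is also in `any_percept`, so
`any_percept` is a substate of `perceiving`, but not conversely.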

21.3 Disposition Ascriptions 

Now we can define properly what it means to ascribe dispositions to A (as we 
have done in part I) , by which A is disposed to change to a state under certain 
circumstances, or by which A is disposed to remain in a state under certain 
circumstances: 

Definition 187 (A’s Basic Disposition Ascriptions)

1. Let $s^{c,d} \in S^{c,d}$, let $X$, $Y$ be states:

A is disposed in $s^{c,d}$ to change (after the amount $\Delta t_e$ of time) to $Y$ given $X$
(and given that the perceptual input to A is constant for the amount $\Delta t_e$
of time) iff

for all $s \in X$, s.t. $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}))$:

$F^{\Delta t_e}_{s^p}(s) \in Y$ (i.e., $F_{s^p}$ applied $\Delta t_e$ times to $s$)

2. let $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$, let $X$, $Y$ be states:

A is disposed in $s$ to change (after the amount $\Delta t_e$ of time) to $Y$ given $X$
(and given that the perceptual input to A is constant for the amount $\Delta t_e$
of time) iff

A is disposed in $s^{c,d}$ to change (after the amount $\Delta t_e$ of time) to $Y$ given $X$
(and given that the perceptual input to A is constant for the amount $\Delta t_e$
of time)

3. let $s^{c,d} \in S^{c,d}$, let $X$, $Y$ be states:

A is disposed in $s^{c,d}$ to remain (for the amount $\Delta t_e$ of time) in $Y$ given $X$
(and given that the perceptual input to A is constant for the amount $\Delta t_e$
of time) iff

for all $s \in X \cap Y$, s.t. $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}))$:

there is an amount $\Delta t$ of time, s.t., after $\Delta t$ state-transitions, the system
remains for (at least) $\Delta t_e$ state-transitions within $Y$, i.e.

$F^{\Delta t}_{s^p}(s) \in Y$ (i.e., $F_{s^p}$ applied $\Delta t$ times),

$F^{\Delta t + 1}_{s^p}(s) \in Y$,

$\ldots$,

$F^{\Delta t + \Delta t_e}_{s^p}(s) \in Y$ (i.e., $F_{s^p}$ applied $\Delta t_e$ times after the first $\Delta t$ times)

4. let $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$, let $X$, $Y$ be states:

A is disposed in $s$ to remain (for the amount $\Delta t_e$ of time) in $Y$ given $X$
(and given that the perceptual input to A is constant for the amount $\Delta t_e$
of time) iff

A is disposed in $s^{c,d}$ to remain (for the amount $\Delta t_e$ of time) in $Y$ given $X$
(and given that the perceptual input to A is constant for the amount $\Delta t_e$
of time).

In the above definition of 'A is disposed in $s^{c,d}$ to remain ...', we have
allowed for a time $\Delta t$ after which the system remains in the specified state.

The last definition has the following obvious consequence:

Corollary 188

If A is disposed in $s$ to remain (for the amount $\Delta t_c$ of time) in $Y$ given $X$,

then A is disposed in $s$ to remain (for the amount $\Delta t_c$ of time) in $Y$
given $X \cap Y$.
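The two basic disposition ascriptions of definition 187 can be sketched operationally. This is only an illustration under stated assumptions: the transition function `F` stands in for the state-transition mapping under constant perceptual input, `cd` is a hypothetical accessor returning the dispositional central component of a parameter-setting, and the search bound `max_dt` is introduced solely to keep the existential quantifier over $\Delta t$ computable.

```python
from typing import Callable, Hashable, Set

Setting = Hashable  # a parameter-setting; its inner structure is left abstract


def iterate(F: Callable[[Setting], Setting], s: Setting, n: int) -> Setting:
    """Apply the transition function F to s exactly n times."""
    for _ in range(n):
        s = F(s)
    return s


def disposed_to_change(F, cd, s_cd, X: Set, Y: Set, dt_c: int) -> bool:
    """Clause 1: every s in X with dispositional component s_cd is in Y after dt_c steps."""
    return all(iterate(F, s, dt_c) in Y for s in X if cd(s) == s_cd)


def disposed_to_remain(F, cd, s_cd, X: Set, Y: Set, dt_c: int, max_dt: int) -> bool:
    """Clause 3: every relevant s in X & Y stays in Y for dt_c steps after some delay dt.

    The delay is searched only up to max_dt (an assumption of this sketch).
    """
    def settles(s) -> bool:
        return any(
            all(iterate(F, s, dt + k) in Y for k in range(dt_c + 1))
            for dt in range(max_dt + 1)
        )
    return all(settles(s) for s in X & Y if cd(s) == s_cd)


# Toy system (hypothetical): a setting is (counter, dispositional component);
# each transition lowers the counter until it reaches 0.
def F(s):
    value, d = s
    return (max(value - 1, 0), d)


def cd(s):
    return s[1]


X = {(3, "d"), (2, "d"), (1, "e")}
Y = {(1, "d"), (0, "d")}

change_after_2 = disposed_to_change(F, cd, "d", X, Y, 2)
change_after_1 = disposed_to_change(F, cd, "d", X, Y, 1)
remain_in_Y = disposed_to_remain(F, cd, "d", Y, Y, 2, 0)
```

Since `disposed_to_remain` only ever inspects the members of `X & Y`, replacing `X` by `X & Y` cannot change its result, which is the content of corollary 188.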

21.4 Belief States 

As explicated in chapter 3, beliefs are certain mental states with propositional 
contents. Furthermore, beliefs are states of the whole system A, not of one of 
her subsystems: e.g., also perceptual beliefs have - by explication - an action- 
guiding function, and therefore they cannot be regarded as states of the percep- 
tual system, since the latter is not able to initiate actions without the central 
system or the action system. Beliefs may be perceptual beliefs, or central state 
beliefs, or both. Furthermore, beliefs may be occurrent, or dispositional, or 
both. Finally, beliefs may be total or not. Let us now make use of our account 
of states in order to improve our original account of beliefs: 

We start with the perceptual beliefs. A perceptual belief of A is a state
of the system A in which A perceptually believes that something is the case.
Here, 'is a perceptual belief of A' is a general term. The perceptual belief of
A that [$\varphi$ is true] - or: A's perceptual belief that [$\varphi$ is true] - is the state of
the system A in which A perceptually believes that [$\varphi$ is true]. The perceptual
belief of A that [$\varphi$ is true] is thus the largest state of the system A in which A
perceptually believes that [$\varphi$ is true]. Here we refer to a certain state by means
of a singular term. It is part of our explication of the notion of perceptual belief
that whether A has in $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}))$ the perceptual
belief that [$\varphi$ is true] or not, is independent of the parameter-settings of the
central system or the action system, but only depends on the parameter-setting
of the perceptual system, i.e., on $s^p$. For every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for
some implication signs $\rightarrow$ and $\Rightarrow$), A's perceptual belief that [$\varphi$ is true] thus
has the following property: there is a perceptual state $\mathfrak{I}^-(b^p\varphi) \subseteq S^p$ s.t. A's
perceptual belief that [$\varphi$ is true] - say, $\mathfrak{I}(b^p\varphi)$ - is just $Cp(\mathfrak{I}^-(b^p\varphi))$, i.e., the
counterpart state of $\mathfrak{I}^-(b^p\varphi)$ on the level of the whole system (recall def. 185).
A's perceptual beliefs are therefore not perceptual states themselves, but they
are counterparts of perceptual states on the level of the whole system. Let us
put this more formally:

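• (The Explication of A's Perceptual Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ of chapter 2 by
all expressions of the form $b^p\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for
some implication signs $\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on
the extended domain, s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some
implication signs $\rightarrow$ and $\Rightarrow$) there is a unique perceptual state $\mathfrak{I}^-(b^p\varphi)$,
s.t. $\mathfrak{I}(b^p\varphi) = Cp(\mathfrak{I}^-(b^p\varphi))$, where $\mathfrak{I}(b^p\varphi)$ is A's perceptual belief state
that [$\varphi$ is true].

We say that $s \models B^p\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(b^p\varphi)$.

A perceptual belief of A is any such state $\mathfrak{I}(b^p\varphi)$ for some $\varphi \in \mathcal{L}$ or
$\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).

Note that $\mathfrak{I}^-(b^p\varphi)$ may of course be empty - in such a case also
$\mathfrak{I}(b^p\varphi) = Cp(\mathfrak{I}^-(b^p\varphi)) = Cp(\emptyset) = \emptyset$ would be empty; the same holds for
the following definitions. Moreover, it is not excluded that $\mathfrak{I}(b^p\varphi) = \mathfrak{I}(b^p\psi)$ for
$\varphi \neq \psi$; in such a case, A's belief state $\mathfrak{I}(b^p\varphi)$ (or $\mathfrak{I}(b^p\psi)$) would not only have
the proposition that [$\varphi$ is true] as its content, but also the proposition that [$\psi$
is true].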



Accordingly, we can define occurrent central state beliefs and disposi- 
tional central state beliefs: 

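• (The Explication of A's Occurrent Central State Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $b^{c,o}\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain,
s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and
$\Rightarrow$) there is a unique occurrent central state $\mathfrak{I}^-(b^{c,o}\varphi)$, s.t. $\mathfrak{I}(b^{c,o}\varphi) =
Cp(\mathfrak{I}^-(b^{c,o}\varphi))$, where $\mathfrak{I}(b^{c,o}\varphi)$ is A's occurrent central belief state that
[$\varphi$ is true].

We say that $s \models B^{c,o}\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(b^{c,o}\varphi)$.

An occurrent central state belief of A is any such state $\mathfrak{I}(b^{c,o}\varphi)$ for some
$\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).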

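• (The Explication of A's Dispositional Central State Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $b^{c,d}\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain,
s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and
$\Rightarrow$) there is a unique dispositional central state $\mathfrak{I}^-(b^{c,d}\varphi)$, s.t. $\mathfrak{I}(b^{c,d}\varphi) =
Cp(\mathfrak{I}^-(b^{c,d}\varphi))$, where $\mathfrak{I}(b^{c,d}\varphi)$ is A's dispositional central belief state that
[$\varphi$ is true].

We say that $s \models B^{c,d}\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(b^{c,d}\varphi)$.

A dispositional central state belief of A is any such state $\mathfrak{I}(b^{c,d}\varphi)$ for some
$\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).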



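Now let us turn to central state beliefs (simpliciter). We regard the
notion of A's central state belief that [$\varphi$ is true] as being derived from the
notions of an occurrent central state belief and of a dispositional central state
belief having the content expressed by $\varphi$:

• (The Explication of A's Central State Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $b^c\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended
domain, s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$): $\mathfrak{I}(b^c\varphi) = Cp(\mathfrak{I}^-(b^c\varphi))$, where $\mathfrak{I}(b^c\varphi)$ is A's
central belief state that [$\varphi$ is true], and $\mathfrak{I}^-(b^c\varphi)$ is a central state with
$\mathfrak{I}^-(b^c\varphi) = \{ s^c \in S^c \mid s^{c,o} \in \mathfrak{I}^-(b^{c,o}\varphi) \text{ or } s^{c,d} \in \mathfrak{I}^-(b^{c,d}\varphi) \}$
(with $s^{c,o}$ and $s^{c,d}$ the occurrent and the dispositional component of $s^c$).

We say that $s \models B^c\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(b^c\varphi)$.

A central state belief of A is any such state $\mathfrak{I}(b^c\varphi)$, or an occurrent central
state belief $\mathfrak{I}(b^{c,o}\varphi)$, or a dispositional central state belief $\mathfrak{I}(b^{c,d}\varphi)$, for
some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).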

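This directly entails the following two corollaries:

Corollary 189

For all $s \in S$, for all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$
and $\Rightarrow$):

$s \models B^{c,o}\varphi$ or $s \models B^{c,d}\varphi$ if and only if $s \models B^c\varphi$.

Corollary 190

For all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$\mathfrak{I}(b^c\varphi) = \mathfrak{I}(b^{c,o}\varphi) \cup \mathfrak{I}(b^{c,d}\varphi)$.

Obviously, $Occ(\mathfrak{I}^-(b^c\varphi))$, i.e., the occurrent part of the central state
$\mathfrak{I}^-(b^c\varphi)$, is identical to $\mathfrak{I}^-(b^{c,o}\varphi)$, and $Dis(\mathfrak{I}^-(b^c\varphi))$, i.e., the dispositional part
of the central state $\mathfrak{I}^-(b^c\varphi)$, is identical to $\mathfrak{I}^-(b^{c,d}\varphi)$.

On the level of belief states, i.e., on the level of the states of the whole
system A, we have: $\mathfrak{I}(b^{c,o}\varphi)$ is a superstate of the central state belief $\mathfrak{I}(b^c\varphi)$,
and also $\mathfrak{I}(b^{c,d}\varphi)$ is a superstate of $\mathfrak{I}(b^c\varphi)$; equivalently: $\mathfrak{I}(b^c\varphi)$ is a substate
of both $\mathfrak{I}(b^{c,o}\varphi)$ and $\mathfrak{I}(b^{c,d}\varphi)$. In the light of corollary 190, we call $\mathfrak{I}(b^c\varphi)$ the
'superposition' of $\mathfrak{I}(b^{c,o}\varphi)$ and $\mathfrak{I}(b^{c,d}\varphi)$.

Note that if $\mathfrak{I}(b^{c,o}\varphi) = \emptyset$ then $\mathfrak{I}(b^c\varphi) = \mathfrak{I}(b^{c,d}\varphi)$, and if $\mathfrak{I}(b^{c,d}\varphi) = \emptyset$
then $\mathfrak{I}(b^c\varphi) = \mathfrak{I}(b^{c,o}\varphi)$.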
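The interplay between component states, their counterparts on the level of the whole system, and the 'superposition' of belief states can be made concrete in a small sketch. It rests on an assumption: def. 185 is not restated in this section, so the function `counterpart` below simply collects all whole-system settings whose relevant component lies in the given component state. The miniature system `S` and all names are hypothetical.

```python
from itertools import product

# Hypothetical miniature system: a setting is a pair
# (occurrent central component, dispositional central component).
S = frozenset(product(["o0", "o1"], ["d0", "d1"]))


def counterpart(component_state, index):
    """Assumed reading of Cp: lift a component state to the whole system."""
    return frozenset(s for s in S if s[index] in component_state)


occ_minus = frozenset({"o1"})  # stands in for the occurrent central state of b^{c,o} phi
dis_minus = frozenset({"d1"})  # stands in for the dispositional central state of b^{c,d} phi

I_co = counterpart(occ_minus, 0)  # occurrent central state belief
I_cd = counterpart(dis_minus, 1)  # dispositional central state belief

# Central state belief: the occurrent OR the dispositional component qualifies.
I_c = frozenset(s for s in S if s[0] in occ_minus or s[1] in dis_minus)

is_union = I_c == I_co | I_cd                       # the union identity
is_substate_of_both = I_c >= I_co and I_c >= I_cd   # substate = superset
```

The central state belief comes out as the larger set, which is why it counts as a substate of both of its constituents.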

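Now we turn to occurrent beliefs - the notion of A's occurrent belief
that [$\varphi$ is true] may be derived from the notions of the perceptual belief
and the occurrent central state belief having the proposition expressed by $\varphi$ as
their content:

• (The Explication of A's Occurrent Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $b^o\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain,
s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$
and $\Rightarrow$): $\mathfrak{I}(b^o\varphi) = Cp(\mathfrak{I}^-(b^o\varphi))$, where $\mathfrak{I}(b^o\varphi)$ is A's occurrent belief
state that [$\varphi$ is true], and $\mathfrak{I}^-(b^o\varphi)$ is an occurrent state with $\mathfrak{I}^-(b^o\varphi) =
\{ s^o \in S^o \mid s^p \in \mathfrak{I}^-(b^p\varphi) \text{ or } s^{c,o} \in \mathfrak{I}^-(b^{c,o}\varphi) \}$
(with $s^p$ and $s^{c,o}$ the perceptual and the occurrent central component of $s^o$).

We say that $s \models B^o\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(b^o\varphi)$.

An occurrent belief of A is any such state $\mathfrak{I}(b^o\varphi)$, or a perceptual belief
$\mathfrak{I}(b^p\varphi)$, or an occurrent central state belief $\mathfrak{I}(b^{c,o}\varphi)$, for some $\varphi \in \mathcal{L}$ or
$\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).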

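So we have:

Corollary 191

For all $s \in S$, for all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$
and $\Rightarrow$):

$s \models B^p\varphi$ or $s \models B^{c,o}\varphi$ if and only if $s \models B^o\varphi$.

Corollary 192

For all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):
$\mathfrak{I}(b^o\varphi) = \mathfrak{I}(b^p\varphi) \cup \mathfrak{I}(b^{c,o}\varphi)$.

$\mathfrak{I}(b^p\varphi)$ and $\mathfrak{I}(b^{c,o}\varphi)$ are superstates of the occurrent belief $\mathfrak{I}(b^o\varphi)$; equivalently:
$\mathfrak{I}(b^o\varphi)$ is a substate of both $\mathfrak{I}(b^p\varphi)$ and $\mathfrak{I}(b^{c,o}\varphi)$. We express the content
of corollary 192 by calling $\mathfrak{I}(b^o\varphi)$ the 'superposition' of $\mathfrak{I}(b^p\varphi)$ and $\mathfrak{I}(b^{c,o}\varphi)$.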

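A's dispositional beliefs are - by explication - identical to A's dispositional
central state beliefs:

• (The Explication of A's Dispositional Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $b^d\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain, s.t.
for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):
$\mathfrak{I}(b^d\varphi) = Cp(\mathfrak{I}^-(b^d\varphi))$, where $\mathfrak{I}(b^d\varphi)$ is A's dispositional belief state that
[$\varphi$ is true], and $\mathfrak{I}^-(b^d\varphi)$ is a dispositional state with $\mathfrak{I}^-(b^d\varphi) = \mathfrak{I}^-(b^{c,d}\varphi)$.

We say that $s \models B^d\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(b^d\varphi)$.

A dispositional belief of A is any such state $\mathfrak{I}(b^d\varphi)$ for some $\varphi \in \mathcal{L}$ or
$\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).

It follows that $\mathfrak{I}(b^d\varphi) = \mathfrak{I}(b^{c,d}\varphi)$.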

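The notion of A's belief that [$\varphi$ is true] may now be defined derivatively
on the notions of A's perceptual belief that [$\varphi$ is true] and A's central state
belief that [$\varphi$ is true], or, equivalently, on the notions of A's occurrent belief
that [$\varphi$ is true] and A's dispositional belief that [$\varphi$ is true]:

• (The Explication of A's Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $b\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain, s.t.
for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):
$\mathfrak{I}(b\varphi) = Cp(\mathfrak{I}^-(b\varphi))$, where $\mathfrak{I}(b\varphi)$ is A's belief state that [$\varphi$ is true], and
$\mathfrak{I}^-(b\varphi)$ is a state with $\mathfrak{I}^-(b\varphi) = \{ s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d}))
\in S \mid s \in \mathfrak{I}(b^p\varphi) \text{ or } s \in \mathfrak{I}(b^c\varphi) \}$, or equivalently, $\mathfrak{I}^-(b\varphi) =
\{ s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S \mid s \in \mathfrak{I}(b^o\varphi) \text{ or } s \in \mathfrak{I}(b^d\varphi) \}$.

We say that $s \models B\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if and
only if $s \in \mathfrak{I}(b\varphi)$.

A belief of A is any such state $\mathfrak{I}(b\varphi)$, or an occurrent belief $\mathfrak{I}(b^o\varphi)$, or
a dispositional belief $\mathfrak{I}(b^d\varphi)$, or a perceptual belief $\mathfrak{I}(b^p\varphi)$, or a central
state belief $\mathfrak{I}(b^c\varphi)$, or an occurrent central state belief $\mathfrak{I}(b^{c,o}\varphi)$, or a
dispositional central state belief $\mathfrak{I}(b^{c,d}\varphi)$, for some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$
(for some implication signs $\rightarrow$ and $\Rightarrow$).

It follows that: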

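Corollary 193

For all $s \in S$, for all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$
and $\Rightarrow$):

$s \models B^o\varphi$ or $s \models B^c\varphi$ or $s \models B^p\varphi$ or $s \models B^{c,o}\varphi$ or $s \models B^{c,d}\varphi$ if and only
if $s \models B\varphi$.

Corollary 194

For all $\varphi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$\mathfrak{I}(b\varphi) =$

$\mathfrak{I}(b^p\varphi) \cup \mathfrak{I}(b^c\varphi) =$

$\mathfrak{I}(b^p\varphi) \cup \mathfrak{I}(b^{c,o}\varphi) \cup \mathfrak{I}(b^{c,d}\varphi) =$

$\mathfrak{I}(b^o\varphi) \cup \mathfrak{I}(b^d\varphi)$.

$\mathfrak{I}(b^o\varphi)$, $\mathfrak{I}(b^c\varphi)$, $\mathfrak{I}(b^p\varphi)$, $\mathfrak{I}(b^{c,o}\varphi)$, $\mathfrak{I}(b^{c,d}\varphi)$ are superstates of $\mathfrak{I}(b\varphi)$ (and
the substate relation holds in the opposite direction). By corollary 194, we may
call $\mathfrak{I}(b\varphi)$ the 'superposition' of $\mathfrak{I}(b^p\varphi)$ and $\mathfrak{I}(b^c\varphi)$, or of $\mathfrak{I}(b^p\varphi)$, $\mathfrak{I}(b^{c,o}\varphi)$, and
$\mathfrak{I}(b^{c,d}\varphi)$, or of $\mathfrak{I}(b^o\varphi)$ and $\mathfrak{I}(b^d\varphi)$.

We have stated the explication of the notion of belief analogously to
the previous cases of belief, but in the case of beliefs we could have actually
abandoned $\mathfrak{I}^-$, since we have: $\mathfrak{I}(b\varphi) = Cp(\mathfrak{I}^-(b\varphi)) = \mathfrak{I}^-(b\varphi)$.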

Note that we have not included those properties of beliefs stated in 
chapter 4 that have to do with the contents of beliefs (like, e.g.: if 5 N J5"(a ^ 
/3) then s \= B^ {a P))^ since the (partial) aim of this chapter is just a 
clarification of the notion of a state.* 

Analogously, we can now say more precisely what a total perceptual
belief is, a total occurrent central state belief, a total dispositional central state
belief, a total central state belief, a total occurrent belief, a total dispositional
belief, and a total belief - we simply apply the informal explication on p.33:

• (The Explication of A's Total Perceptual Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb^p\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain, s.t.
for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):
$\mathfrak{I}(tb^p\varphi)$ (i.e., A's total perceptual belief state that [$\varphi$ is true]) is the set
of parameter-settings $s$, s.t.

*But it is clear that those properties could be formulated within the theory that we have 
developed in this chapter: e.g., in order to ensure that if s t= B{a (3) then s 1= B^{a — ^ (3), 
we would have to postulate that S only contains parameter-settings that actually meet this 
condition. 
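1. $s \in \mathfrak{I}(b^p\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b^p\psi)$ then $\mathfrak{I}(b^p\psi)$ is a substate of $\mathfrak{I}(b^p\varphi)$

We say that $s \models \Delta B^p\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(tb^p\varphi)$.

A total perceptual belief of A is any such state $\mathfrak{I}(tb^p\varphi)$ for some $\varphi \in \mathcal{L}$
or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).

It is easy to see that, equivalently, $\mathfrak{I}(tb^p\varphi)$ might be defined as the set
$\mathfrak{I}(b^p\varphi) \setminus \bigcup \{ \mathfrak{I}(b^p\psi) \mid \psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow} \text{ for some } \rightarrow \text{ and } \Rightarrow, \text{ s.t. } \mathfrak{I}(b^p\psi) \not\supseteq
\mathfrak{I}(b^p\varphi) \}$ (an analogous equivalent definition can be given for all other kinds of
total beliefs).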
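The two equivalent characterizations of a total belief state can be compared on a toy example. The sketch assumes that belief states are given extensionally as finite sets of parameter-settings; the dictionary `beliefs` and the labels `phi`, `psi`, `chi` are purely illustrative.

```python
# Hypothetical belief states over six parameter-settings 0..5.
beliefs = {
    "phi": frozenset({0, 1, 2}),
    "psi": frozenset({0, 1, 2, 3}),  # a substate (superset) of phi
    "chi": frozenset({1, 4}),        # not a substate of phi
}


def total_belief(name):
    """Settings in the belief state at which every held belief is a substate of it."""
    target = beliefs[name]
    return frozenset(
        s for s in target
        if all(held >= target for held in beliefs.values() if s in held)
    )


def total_belief_by_difference(name):
    """The belief state minus the union of all belief states that are not substates of it."""
    target = beliefs[name]
    excluded = frozenset().union(*(b for b in beliefs.values() if not b >= target))
    return target - excluded


total_phi = total_belief("phi")
same = all(total_belief(n) == total_belief_by_difference(n) for n in beliefs)
```

In this example the total belief state for `phi` is not itself one of the listed belief states, in line with the remark that a total belief need not be a belief.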

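Note that $\mathfrak{I}(tb^p\varphi)$ may of course also be empty, and that it might
very well be the case that there is no $\psi$ s.t. $\mathfrak{I}(tb^p\varphi) = \mathfrak{I}(b^p\psi)$, i.e., a total
perceptual belief is not necessarily a perceptual belief (and analogously for all
other kinds of total beliefs). We can also define $\mathfrak{I}^-(tb^p\varphi)$ as the set of perceptual
parameter-settings $s^p$ s.t. $s^p \in \mathfrak{I}^-(b^p\varphi)$, and for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some
implication signs $\rightarrow$ and $\Rightarrow$): if $s^p \in \mathfrak{I}^-(b^p\psi)$ then $\mathfrak{I}^-(b^p\psi)$ is a substate of
$\mathfrak{I}^-(b^p\varphi)$. $\mathfrak{I}^-(tb^p\varphi)$ is then a perceptual state, and one can show that $\mathfrak{I}(tb^p\varphi) =
Cp(\mathfrak{I}^-(tb^p\varphi))$. Analogous remarks can be made for all the notions of total
belief to be considered in the following, and $\mathfrak{I}^-(tb^{c,o}\varphi)$, $\mathfrak{I}^-(tb^{c,d}\varphi)$, $\mathfrak{I}^-(tb^c\varphi)$,
$\mathfrak{I}^-(tb^o\varphi)$, $\mathfrak{I}^-(tb^d\varphi)$, and $\mathfrak{I}^-(tb\varphi)$ $(= \mathfrak{I}(tb\varphi))$ can be defined analogously.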

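• (The Explication of A's Total Occurrent Central State Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb^{c,o}\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain,
s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and
$\Rightarrow$): $\mathfrak{I}(tb^{c,o}\varphi)$ (i.e., A's total occurrent central state belief that [$\varphi$ is true])
is the set of parameter-settings $s$, s.t.

1. $s \in \mathfrak{I}(b^{c,o}\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b^{c,o}\psi)$ then $\mathfrak{I}(b^{c,o}\psi)$ is a substate of $\mathfrak{I}(b^{c,o}\varphi)$

We say that $s \models \Delta B^{c,o}\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$
if and only if $s \in \mathfrak{I}(tb^{c,o}\varphi)$.

A total occurrent central state belief of A is any such state $\mathfrak{I}(tb^{c,o}\varphi)$ for
some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).

• (The Explication of A's Total Dispositional Central State Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb^{c,d}\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain,
s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and
$\Rightarrow$): $\mathfrak{I}(tb^{c,d}\varphi)$ (i.e., A's total dispositional central state belief that [$\varphi$ is
true]) is the set of parameter-settings $s$, s.t.

1. $s \in \mathfrak{I}(b^{c,d}\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b^{c,d}\psi)$ then $\mathfrak{I}(b^{c,d}\psi)$ is a substate of $\mathfrak{I}(b^{c,d}\varphi)$

We say that $s \models \Delta B^{c,d}\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$
if and only if $s \in \mathfrak{I}(tb^{c,d}\varphi)$.

A total dispositional central state belief of A is any such state $\mathfrak{I}(tb^{c,d}\varphi)$
for some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).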

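• (The Explication of A's Total Central State Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb^c\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain, s.t.
for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):
$\mathfrak{I}(tb^c\varphi)$ (i.e., A's total central state belief that [$\varphi$ is true]) is the set of
parameter-settings $s$, s.t.

1. $s \in \mathfrak{I}(b^c\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b^c\psi)$ then $\mathfrak{I}(b^c\psi)$ is a substate of $\mathfrak{I}(b^c\varphi)$

We say that $s \models \Delta B^c\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(tb^c\varphi)$.

A total central state belief of A is any such state $\mathfrak{I}(tb^c\varphi)$, or a total occurrent
central state belief $\mathfrak{I}(tb^{c,o}\varphi)$, or a total dispositional central state
belief $\mathfrak{I}(tb^{c,d}\varphi)$, for some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication
signs $\rightarrow$ and $\Rightarrow$).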

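• (The Explication of A's Total Occurrent Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb^o\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain, s.t.
for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and
$\Rightarrow$): $\mathfrak{I}(tb^o\varphi)$ (i.e., A's total occurrent belief that [$\varphi$ is true]) is the set of
parameter-settings $s$, s.t.

1. $s \in \mathfrak{I}(b^o\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b^o\psi)$ then $\mathfrak{I}(b^o\psi)$ is a substate of $\mathfrak{I}(b^o\varphi)$

We say that $s \models \Delta B^o\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(tb^o\varphi)$.

A total occurrent belief of A is any such state $\mathfrak{I}(tb^o\varphi)$, or a total perceptual
belief $\mathfrak{I}(tb^p\varphi)$, or a total occurrent central state belief $\mathfrak{I}(tb^{c,o}\varphi)$, for
some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).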

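• (The Explication of A's Total Dispositional Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb^d\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain, s.t.
for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):
$\mathfrak{I}(tb^d\varphi)$ (i.e., A's total dispositional belief that [$\varphi$ is true]) is the set of
parameter-settings $s$, s.t.

1. $s \in \mathfrak{I}(b^d\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b^d\psi)$ then $\mathfrak{I}(b^d\psi)$ is a substate of $\mathfrak{I}(b^d\varphi)$

We say that $s \models \Delta B^d\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(tb^d\varphi)$.

A total dispositional belief of A is any such state $\mathfrak{I}(tb^d\varphi)$ for some $\varphi \in \mathcal{L}$
or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).

It follows that $\mathfrak{I}(tb^d\varphi) = \mathfrak{I}(tb^{c,d}\varphi)$.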

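We finally end up with a notion of a total belief:

• (The Explication of A's Total Beliefs Reconsidered)

We extend the domain of the interpretation mapping $\mathfrak{I}$ by all expressions
of the form $tb\varphi$, where $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs
$\rightarrow$ and $\Rightarrow$), and we consider $\mathfrak{I}$ as a mapping on the extended domain,
s.t. for every $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$
and $\Rightarrow$): $\mathfrak{I}(tb\varphi)$ (i.e., A's total belief state that [$\varphi$ is true]) is the set of
parameter-settings $s$, s.t.

1. $s \in \mathfrak{I}(b\varphi)$, and

2. for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$): if
$s \in \mathfrak{I}(b\psi)$ then $\mathfrak{I}(b\psi)$ is a substate of $\mathfrak{I}(b\varphi)$

We say that $s \models \Delta B\varphi$ for $s = (s^p, s^{c,o}, s^{c,d}, s^a, nc(s^{c,d}), na(s^{c,d})) \in S$ if
and only if $s \in \mathfrak{I}(tb\varphi)$.

A total belief of A is any such state $\mathfrak{I}(tb\varphi)$, or a total occurrent belief
$\mathfrak{I}(tb^o\varphi)$, or a total dispositional belief $\mathfrak{I}(tb^d\varphi)$, or a total perceptual belief
$\mathfrak{I}(tb^p\varphi)$, or a total central state belief $\mathfrak{I}(tb^c\varphi)$, or a total occurrent central
state belief $\mathfrak{I}(tb^{c,o}\varphi)$, or a total dispositional central state belief $\mathfrak{I}(tb^{c,d}\varphi)$,
for some $\varphi \in \mathcal{L}$ or $\varphi \in \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$).

Speaking metaphorically, $\mathfrak{I}(tb^p\varphi)$, $\mathfrak{I}(tb^{c,o}\varphi)$, $\mathfrak{I}(tb^{c,d}\varphi)$, $\mathfrak{I}(tb^c\varphi)$, $\mathfrak{I}(tb^o\varphi)$,
$\mathfrak{I}(tb^d\varphi)$, and $\mathfrak{I}(tb\varphi)$ are generally "small" sets, since they are subsets of all the
states $\mathfrak{I}(b^p\psi)$, $\mathfrak{I}(b^{c,o}\psi)$, $\mathfrak{I}(b^{c,d}\psi)$, $\mathfrak{I}(b^c\psi)$, $\mathfrak{I}(b^o\psi)$, $\mathfrak{I}(b^d\psi)$, and $\mathfrak{I}(b\psi)$, respectively,
which we have referred to in the definitions of the very total belief states.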

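As a corollary, we get:

Corollary 195

1. For all $s_1^p, s_2^p \in S^p$: if $s_1^p \models \Delta B^p\varphi$ and $s_2^p \models \Delta B^p\varphi$, then for all $\psi \in
\mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1^p \models B^p\psi$ iff $s_2^p \models B^p\psi$

2. for all $s_1^{c,o}, s_2^{c,o} \in S^{c,o}$: if $s_1^{c,o} \models \Delta B^{c,o}\varphi$ and $s_2^{c,o} \models \Delta B^{c,o}\varphi$, then for all
$\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1^{c,o} \models B^{c,o}\psi$ iff $s_2^{c,o} \models B^{c,o}\psi$

3. for all $s_1^{c,d}, s_2^{c,d} \in S^{c,d}$: if $s_1^{c,d} \models \Delta B^{c,d}\varphi$ and $s_2^{c,d} \models \Delta B^{c,d}\varphi$, then for all
$\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1^{c,d} \models B^{c,d}\psi$ iff $s_2^{c,d} \models B^{c,d}\psi$

4. for all $s_1^c, s_2^c \in S^c$: if $s_1^c \models \Delta B^c\varphi$ and $s_2^c \models \Delta B^c\varphi$, then for all $\psi \in
\mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1^c \models B^c\psi$ iff $s_2^c \models B^c\psi$

5. for all $s_1^o, s_2^o \in S^o$: if $s_1^o \models \Delta B^o\varphi$ and $s_2^o \models \Delta B^o\varphi$, then for all $\psi \in
\mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1^o \models B^o\psi$ iff $s_2^o \models B^o\psi$

6. for all $s_1^d, s_2^d \in S^d$: if $s_1^d \models \Delta B^d\varphi$ and $s_2^d \models \Delta B^d\varphi$, then for all $\psi \in
\mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$ (for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1^d \models B^d\psi$ iff $s_2^d \models B^d\psi$

7. for all $s_1, s_2 \in S$: if $s_1 \models \Delta B\varphi$ and $s_2 \models \Delta B\varphi$, then for all $\psi \in \mathcal{L} \cup \mathcal{L}_{\rightarrow} \cup \mathcal{L}_{\Rightarrow}$
(for some implication signs $\rightarrow$ and $\Rightarrow$):

$s_1 \models B\psi$ iff $s_2 \models B\psi$.

The proof of 1 is by indirect assumption that there is a belief $\mathfrak{I}(b^p\psi)$
s.t. $s_1^p$ is a member of $\mathfrak{I}(b^p\psi)$, but $s_2^p$ is not. The proofs of the other claims
are analogous. Corollary 195 expresses that the members of a total belief are
indistinguishable in terms of belief ascription - but they may of course differ in
other ways. However, if A were a system having states of belief as her only mental
states, then having more than one parameter-setting $s$ s.t., e.g., $s \models \Delta B\varphi$,
would be redundant as far as A's mental states are concerned; redundancy
would be avoided in the case where $\mathfrak{I}(tb\varphi)$ were a singleton.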
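The indistinguishability claim of corollary 195 can be checked mechanically on a small example. The sketch assumes that belief states are given as finite sets of parameter-settings; the dictionary `beliefs`, the helper `belief_profile` and the labels are hypothetical.

```python
# Hypothetical belief states over parameter-settings 0..4.
beliefs = {
    "phi": frozenset({0, 1, 2}),
    "psi": frozenset({0, 1, 2, 3}),
    "chi": frozenset({2, 4}),
}


def total_belief(target):
    """Members of target at which every held belief is a substate (superset) of target."""
    return frozenset(
        s for s in target
        if all(state >= target for state in beliefs.values() if s in state)
    )


def belief_profile(s):
    """The set of (labels of) beliefs that hold at the parameter-setting s."""
    return frozenset(name for name, state in beliefs.items() if s in state)


# All members of one total belief state must share a single belief profile.
indistinguishable = all(
    len({belief_profile(s) for s in total_belief(state)}) <= 1
    for state in beliefs.values()
)

total_phi = total_belief(beliefs["phi"])
```

Here the total belief state for `phi` has two members, which satisfy exactly the same beliefs; this is the redundancy that would disappear if every total belief state were a singleton.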



21.5 (Direct) Causation and Sustaining of Belief States by Belief 
States 

By combining def.187 and our improved explication of belief concepts above, we 
are led to the following account of causation sentences and sustaining sentences 
expressing the direct causation/sustaining of beliefs by beliefs or total beliefs; 
we also add the (not equally important) definition of creation of beliefs by 
beliefs or total beliefs: 

Definition 196 (Direct Causation/Sustaining/Creation of Beliefs by Beliefs
or Total Beliefs Reconsidered)

Let $(s(t))_{t \in In}$ be a possible trajectory of A, s.t. for all $t \in In$:
$s(t) = (s^p(t), s^{c,o}(t), s^{c,d}(t), s^a(t), nc(s^{c,d}(t)), na(s^{c,d}(t)))$;
let $t \in In$:

1. $t, (s(t))_{t \in In} \models Causes(b\varphi, b\psi)$ iff

$s(t - \Delta t_c) \in \mathfrak{I}(b\varphi)$, and A is disposed in $s(t - \Delta t_c)$ to change (after the
amount $\Delta t_c$ of time) to $\mathfrak{I}(b\psi)$ given $\mathfrak{I}(b\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

2. $t, (s(t))_{t \in In} \models Causes(b^p\varphi, b^{c,o}\psi)$ iff

$s(t - \Delta t_c) \in \mathfrak{I}(b^p\varphi)$, and A is disposed in $s(t - \Delta t_c)$ to change (after the
amount $\Delta t_c$ of time) to $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(b^p\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

3. $t, (s(t))_{t \in In} \models Causes(b^{c,o}\varphi, b^{c,o}\psi)$ iff

$s(t - \Delta t_c) \in \mathfrak{I}(b^{c,o}\varphi)$, and A is disposed in $s(t - \Delta t_c)$ to change (after
the amount $\Delta t_c$ of time) to $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(b^{c,o}\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

4. $t, (s(t))_{t \in In} \models Causes(tb\varphi, b\psi)$ iff

$s(t - \Delta t_c) \in \mathfrak{I}(tb\varphi)$, and A is disposed in $s(t - \Delta t_c)$ to change (after the
amount $\Delta t_c$ of time) to $\mathfrak{I}(b\psi)$ given $\mathfrak{I}(tb\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

5. $t, (s(t))_{t \in In} \models Causes(tb^p\varphi, b^{c,o}\psi)$ iff

$s(t - \Delta t_c) \in \mathfrak{I}(tb^p\varphi)$, and A is disposed in $s(t - \Delta t_c)$ to change (after the
amount $\Delta t_c$ of time) to $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(tb^p\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

6. $t, (s(t))_{t \in In} \models Causes(tb^{c,o}\varphi, b^{c,o}\psi)$ iff

$s(t - \Delta t_c) \in \mathfrak{I}(tb^{c,o}\varphi)$, and A is disposed in $s(t - \Delta t_c)$ to change (after
the amount $\Delta t_c$ of time) to $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(tb^{c,o}\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

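7. $t, (s(t))_{t \in In} \models Creates(b\varphi, b\psi)$ iff $t, (s(t))_{t \in In} \models Causes(b\varphi, b\psi)$ and
$s(t - \Delta t_c) \notin \mathfrak{I}(b\psi)$

8. $t, (s(t))_{t \in In} \models Creates(b^p\varphi, b^{c,o}\psi)$ iff $t, (s(t))_{t \in In} \models Causes(b^p\varphi, b^{c,o}\psi)$
and $s(t - \Delta t_c) \notin \mathfrak{I}(b^{c,o}\psi)$

9. $t, (s(t))_{t \in In} \models Creates(b^{c,o}\varphi, b^{c,o}\psi)$ iff $t, (s(t))_{t \in In} \models Causes(b^{c,o}\varphi, b^{c,o}\psi)$
and $s(t - \Delta t_c) \notin \mathfrak{I}(b^{c,o}\psi)$

10. $t, (s(t))_{t \in In} \models Creates(tb\varphi, b\psi)$ iff $t, (s(t))_{t \in In} \models Causes(tb\varphi, b\psi)$ and
$s(t - \Delta t_c) \notin \mathfrak{I}(b\psi)$

11. $t, (s(t))_{t \in In} \models Creates(tb^p\varphi, b^{c,o}\psi)$ iff $t, (s(t))_{t \in In} \models Causes(tb^p\varphi, b^{c,o}\psi)$
and $s(t - \Delta t_c) \notin \mathfrak{I}(b^{c,o}\psi)$

12. $t, (s(t))_{t \in In} \models Creates(tb^{c,o}\varphi, b^{c,o}\psi)$ iff $t, (s(t))_{t \in In} \models Causes(tb^{c,o}\varphi, b^{c,o}\psi)$
and $s(t - \Delta t_c) \notin \mathfrak{I}(b^{c,o}\psi)$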

13. $s(t) \models Sustains(b\varphi, b\psi)$ iff

$s(t) \in \mathfrak{I}(b\varphi)$, $s(t) \in \mathfrak{I}(b\psi)$, and A is disposed in $s(t)$ to remain (for the
amount $\Delta t_c$ of time) in $\mathfrak{I}(b\psi)$ given $\mathfrak{I}(b\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

14. $s(t) \models Sustains(b^p\varphi, b^{c,o}\psi)$ iff

$s(t) \in \mathfrak{I}(b^p\varphi)$, $s(t) \in \mathfrak{I}(b\psi)$, and A is disposed in $s(t)$ to remain (for the
amount $\Delta t_c$ of time) in $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(b^p\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

15. $s(t) \models Sustains(b^{c,o}\varphi, b^{c,o}\psi)$ iff

$s(t) \in \mathfrak{I}(b^{c,o}\varphi)$, $s(t) \in \mathfrak{I}(b\psi)$, and A is disposed in $s(t)$ to remain (for the
amount $\Delta t_c$ of time) in $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(b^{c,o}\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

16. $s(t) \models Sustains(tb\varphi, b\psi)$ iff

$s(t) \in \mathfrak{I}(tb\varphi)$, $s(t) \in \mathfrak{I}(b\psi)$, and A is disposed in $s(t)$ to remain (for the
amount $\Delta t_c$ of time) in $\mathfrak{I}(b\psi)$ given $\mathfrak{I}(tb\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

17. $s(t) \models Sustains(tb^p\varphi, b^{c,o}\psi)$ iff

$s(t) \in \mathfrak{I}(tb^p\varphi)$, $s(t) \in \mathfrak{I}(b\psi)$, and A is disposed in $s(t)$ to remain (for the
amount $\Delta t_c$ of time) in $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(tb^p\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time)

18. $s(t) \models Sustains(tb^{c,o}\varphi, b^{c,o}\psi)$ iff

$s(t) \in \mathfrak{I}(tb^{c,o}\varphi)$, $s(t) \in \mathfrak{I}(b\psi)$, and A is disposed in $s(t)$ to remain (for
the amount $\Delta t_c$ of time) in $\mathfrak{I}(b^{c,o}\psi)$ given $\mathfrak{I}(tb^{c,o}\varphi)$

(and given that the perceptual input to A is constant for the amount $\Delta t_c$
of time).

Note that in our definitions of sustaining, '$s(t)$' could (and actually
should) be replaced by the non-complex '$s$', but it is easy to see that our using
the temporal variable '$t$' does not cause any formal problems in these cases.

21.6 Processes 

Finally, we can define what a mental process is in the way sketched on p.56 in
chapter 4: for "practical" reasons, let us only focus on finite trajectories of A.

Let $R$ be the set of finite trajectories $(s_0, \ldots, s_n)$ of parameter-settings
(for A), where $n \in \mathbb{N}_0$.

Definition 197 (A's Processes)

1. A process $Pr$ (of A) is a set of finite trajectories of parameter-settings
(for A), i.e. $Pr \subseteq R$; $\wp(R)$ is thus the set of processes

2. let $(s_0, \ldots, s_m) \in R$:

A is in $(s_0, \ldots, s_m)$ in the process $Pr$ iff
there is a $(t_0, \ldots, t_n) \in Pr$, s.t.

(a) $n \leq m$, and

(b) $s_k = t_0, \ldots, s_{k+n} = t_n$ for some $k \in \{0, \ldots, m - n\}$.

So if $(s(t), \ldots, s(t + m)) \in R$ is the sequence of parameter-settings of
A from time $t$ until time $t + m$, we can also say:

A is from $t$ up to $t + m$ in the process $Pr$ iff

there is a $(t_0, \ldots, t_n) \in Pr$, s.t. $n \leq m$, and $s(t + k) = t_0, \ldots, s(t + k + n) = t_n$
for some $k \in \{0, \ldots, m - n\}$.

In words: A is in $(s(t), \ldots, s(t + m))$ in the process $Pr$ iff there is a sequence
$(t_0, \ldots, t_n) \in Pr$ s.t. $(t_0, \ldots, t_n)$ is a subsequence of $(s(t), \ldots, s(t + m))$
(i.e., if the latter sequence is processed, then also the former).

We use '$Pr$' (with or without indices) as a variable ranging over processes.
Instead of saying that A is in ... in the process $Pr$, we also say that $Pr$
takes place in A in or within ...
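Clause 2 of definition 197 amounts to a search for a block of consecutive positions, which the following minimal sketch makes explicit. It assumes that trajectories are represented as Python tuples and processes as sets of such tuples; the names `occurs_in` and `is_in_process` are illustrative.

```python
def occurs_in(run, traj):
    """True iff traj = (t_0, ..., t_n) fills positions k, ..., k+n of run = (s_0, ..., s_m)."""
    n, m = len(traj) - 1, len(run) - 1
    if n > m:
        return False
    return any(tuple(run[k:k + n + 1]) == tuple(traj) for k in range(m - n + 1))


def is_in_process(run, process):
    """A is in `run` in the process iff some trajectory of the process occurs in it."""
    return any(occurs_in(run, traj) for traj in process)


run = ("a", "b", "c", "d")
contiguous = is_in_process(run, {("b", "c")})
scattered = is_in_process(run, {("a", "c")})
too_long = is_in_process(run, {("a", "b", "c", "d", "e")})
```

Note that the matching positions must be consecutive: `("a", "c")` is scattered across `run` but does not occupy adjacent positions, so it does not count.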

Subprocesses may be defined as follows: 

Definition 198 (Subprocesses)

A process $Pr_1$ (of A) is a subprocess of a process $Pr_2$ (of A) iff

for all $(s_0, \ldots, s_m) \in Pr_2$ there is a $(t_0, \ldots, t_n) \in Pr_1$, s.t.

1. $n \leq m$, and

2. $s_k = t_0, \ldots, s_{k+n} = t_n$ for some $k \in \{0, \ldots, m - n\}$.

This is motivated by the following corollary:

Corollary 199

$Pr_1$ is a subprocess of a process $Pr_2$ iff

for all $(s_0, \ldots, s_m) \in R$: if A is in $(s_0, \ldots, s_m)$ in the process $Pr_2$, then
A is in $(s_0, \ldots, s_m)$ in the process $Pr_1$.
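The subprocess relation and the equivalence stated in corollary 199 can be spot-checked on a bounded universe of trajectories. This is a sanity check rather than a proof: the corollary quantifies over all finite trajectories, whereas the sketch only enumerates trajectories of length at most 3 over a two-letter alphabet. All names and example processes are hypothetical.

```python
from itertools import product


def occurs_in(run, traj):
    """True iff traj fills consecutive positions of run."""
    n, m = len(traj) - 1, len(run) - 1
    return n <= m and any(
        tuple(run[k:k + n + 1]) == tuple(traj) for k in range(m - n + 1)
    )


def is_in_process(run, process):
    return any(occurs_in(run, traj) for traj in process)


def is_subprocess(pr1, pr2):
    """Definition 198: every trajectory of pr2 contains some trajectory of pr1."""
    return all(is_in_process(run, pr1) for run in pr2)


def corollary_199_agrees(pr1, pr2, universe):
    """Compare definition 198 with the right-hand side of corollary 199 on `universe`."""
    rhs = all(
        is_in_process(run, pr1) for run in universe if is_in_process(run, pr2)
    )
    return is_subprocess(pr1, pr2) == rhs


# Bounded stand-in for R: all trajectories of length 1 to 3 over {"a", "b"}.
universe = [run for length in (1, 2, 3) for run in product("ab", repeat=length)]

pr2 = {("a", "b"), ("b", "a")}
sub = is_subprocess({("a",)}, pr2)
not_sub = is_subprocess({("a", "a")}, pr2)
agrees = corollary_199_agrees({("a",)}, pr2, universe) and corollary_199_agrees(
    {("a", "a")}, pr2, universe
)
```

The check only works because every trajectory of `pr2` lies inside the enumerated universe; with a smaller universe the two sides could come apart.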

Moreover, we can define (omitting action states again):

Definition 200 (Processes Leading from Occurrent Belief States to Occurrent
Belief States)

For all kinds of states $X$, $Y$:

a process $Pr$ leads from $X$ to $Y$ iff

for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \in X$ and $s_n \in Y$.

Instead of saying that a process $Pr$ leads from $X$ to $Y$, we can also say
that $Pr$ leads to the state-transition from $X$ to $Y$, or that $Pr$ brings about the
state-transition from $X$ to $Y$.
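Definition 200 only constrains the first and the last parameter-setting of each trajectory, as the following small sketch shows. It assumes trajectories are non-empty tuples and states are sets; the name `leads_from` and the example data are illustrative.

```python
def leads_from(process, X, Y):
    """A process leads from X to Y iff every trajectory starts in X and ends in Y."""
    return all(run[0] in X and run[-1] in Y for run in process)


X = {"x1", "x2"}
Y = {"y1"}

process = {("x1", "m", "y1"), ("x2", "y1")}
leads = leads_from(process, X, Y)
fails = leads_from(process | {("x1", "m")}, X, Y)
```

The intermediate settings (such as `"m"` above) are unconstrained, and the empty process leads from any state to any state vacuously.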

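This immediately implies (together with our characterization of belief
states):

Corollary 201

1. A process $Pr$ leads from $\mathfrak{I}(b\varphi)$ to $\mathfrak{I}(b\psi)$ iff
for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \models B\varphi$ and $s_n \models B\psi$

2. a process $Pr$ leads from $\mathfrak{I}(tb\varphi)$ to $\mathfrak{I}(b\psi)$ iff
for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \models \Delta B\varphi$ and $s_n \models B\psi$

3. a process $Pr$ leads from $\mathfrak{I}(b^p\varphi)$ to $\mathfrak{I}(b^{c,o}\psi)$ iff
for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \models B^p\varphi$ and $s_n \models B^{c,o}\psi$

4. a process $Pr$ leads from $\mathfrak{I}(tb^p\varphi)$ to $\mathfrak{I}(b^{c,o}\psi)$ iff
for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \models \Delta B^p\varphi$ and $s_n \models B^{c,o}\psi$

5. a process $Pr$ leads from $\mathfrak{I}(b^{c,o}\varphi)$ to $\mathfrak{I}(b^{c,o}\psi)$ iff
for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \models B^{c,o}\varphi$ and $s_n \models B^{c,o}\psi$

6. a process $Pr$ leads from $\mathfrak{I}(tb^{c,o}\varphi)$ to $\mathfrak{I}(b^{c,o}\psi)$ iff
for all $(s_0, \ldots, s_n) \in Pr$: $s_0 \models \Delta B^{c,o}\varphi$ and $s_n \models B^{c,o}\psi$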

It is easy to see that if a belief or total belief directly causes another 
belief, then there is a certain process that leads from the former belief to the 
latter one: 

Corollary 202 (Direct Causation by Beliefs or Total Beliefs Involves Pro- 
cesses) 

Let {s{t))^^j^ be a possible trajectory of A, s.t. for all t £ In: 
s{t) = 
let t G In: 
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1. ^ Causes{bip, b^p) iff there is a process Pr 

that leads from {5 G S' |s N B(f, s^^^ = s^^^(t — Ate) } to s,t. 

Pr = {{s{t - Ate), ■ ■ ■,s{t)) I {s{t - Ate), . . ■ ,s{t)) a traj., - Ate) 

= - Ate)}, 

where A is in {s{t — Ate), ■ ■ ■ , s{t)) in the process Pr 
(i.e., (s{t - Ate), • • • , s{t)) e Pr) 

2. t, {s{t))^^jj^ 1= Causes{tb<p, btp) iff there is a process Pr 

that leads from {s G 5 |s 1= ABip, s'^’'^ = s'^'^{t — Ate) } to ‘^{tnp), s.t. 

Pr = {{s{t - Ate), . ■ ■,s{t)) I {s{t - Ate), ■ ■ ■ ,s{t)) a traj., s^’‘^{t - Ate) 
= s'^’^it - Ate)}, 

where A is in $(s(t - \Delta t_c), \ldots, s(t))$ in the process Pr
(i.e., $(s(t - \Delta t_c), \ldots, s(t)) \in Pr$)

3. t, (s(t))tg/„ 1= Causes{lf<p, b^’°-ip) iff there is a process Pr 

that leads from {s G 5 |s 1= B^ip, — Ate) } to 3{b'^’°'ip), s.t. 

Pr = {{s{t - Ate), ■ ■ •,s(t)) I {s{t - Ate), ■ ■ ■,s{t)) a traj.,s'^'‘^{t - Ate) 
= s^’^^it - Ate)}, 

where A is in $(s(t - \Delta t_c), \ldots, s(t))$ in the process Pr
(i.e., $(s(t - \Delta t_c), \ldots, s(t)) \in Pr$)

4. t, {s{t))^^J^ 1= Causes{tlP(p, b‘^’°ip) iff there is a process Pr

that leads from {s G 5 |s 1= AB^cp, s^’‘^ = s‘^’'^{t — Ate) } to 3{b‘^'°i)>), s.t. 

Pr = {{s{t - Ate), ■ ■ ■ ,s(i)) I {s{t - Ate), ■ ■ -,Ht)) a traj., s^'^^it - Ate) 
= s‘^'‘^{t - Ate)}, 

where A is in $(s(t - \Delta t_c), \ldots, s(t))$ in the process Pr
(i.e., $(s(t - \Delta t_c), \ldots, s(t)) \in Pr$)

5. t, 1= Causes{b^^^ (f, b^^^xp) iff there is a process Pr 

that leads from {5 G S |s 1= — Ate) } to '3{b^'^\p), s.t. 

Pr = {{s{t - Ate ), . . ■,s{t)) I (s{t - Ate), ■ ■ ■ , s(^)) a traj., - Ate) 

= s^^\t - Ate)}, 

where A is in $(s(t - \Delta t_c), \ldots, s(t))$ in the process Pr
(i.e., $(s(t - \Delta t_c), \ldots, s(t)) \in Pr$)

6. t, (s{t))^^j^ N Causes{tb^^^(f, b^'^'ip) iff there is a process Pr 

that leads from {5 G S |5 N AB^'^ip, s^^^ = s^^^{t — Ate) } to 3{b^^^^), s.t. 

Pr = {{s{t - Ate), ■ ■ • , s(^)) I {s{t - Ate), ■ ■ ■,s{t)) a traj., - Ate) 

= S<^’^{t - Ate)}, 
where A is in $(s(t - \Delta t_c), \ldots, s(t))$ in the process Pr
(i.e., $(s(t - \Delta t_c), \ldots, s(t)) \in Pr$).

An analogous corollary holds for direct sustaining: 

Corollary 203 (Direct Sustaining by Beliefs or Total Beliefs Involves Pro- 
cesses) 

Let s be a parameter- setting of A, s.t. s = (s^, 5 ^’^, 5 ^, nc(5^’^), 

na{s^^^)): 

1. s Sustains{b(f, b^|;) iff there is a process Pr 

that leads from {s G 5 |s N B(p,s h Bif, s^'^ = s^^^ } to 3{b'i!j), and that 
remains in ^ibip) for an amount Ate of time, s.t. 

Pr = {{s{t - Ate), ■ ■ ■,s(t)) I {s{t - Ate), ■ . .,s{t)) a traj., s‘^’‘^{t - Ate) 

= sc,<i} 

2. s^ Sustains{tb(p, b'l/j) iff there is a process Pr 

that leads from {s E S' |s N AB<p, s N B^jj, = s^'^ } to 3{b'ip), and that 
remains in 3{b'tp) for an amount Ate of time, s.t. 

Pr = {{s{t - Ate ), . . ■,s{t)) I {s{t - Ate), ■ ■ ■ ,s(i)) a traj., s‘^’‘^{t - Ate) 

3. s^ Sustains{lf(f, b^^^'ip) iff there is a process Pr 

that leads from {s G S |s 1= B^(p, s \= s^'^ — s^'^ } to and 

that remains in 3{b^'^'ip) for an amount Ate of time, s.t. 

Pr = {{s{t - Ate), ■ ■ ■,s{t)) 1 {s{t - Ate), ■ ■ ■ ,s{t)) a traj., s'^’'^{t - A^c) 

4. s N Sustains{tbP(fi, iff there is a process Pr

that leads from {s G S |s N AB^ip, s N B^^^'ip, s^^^ = s^'^ } to dib^^^'ip), and 
that remains in for an amount Ate of time, s.t. 

Pr = {{s{t - Ate), ■ . .,s{t)) I {s{t - Ate), ■ ■ ■,s{t)) a traj., - Ate) 

5. s 1= Sustains(ff^^p, b^^^'ip) iff there is a process Pr 

that leads from {s G S |s h s N s^'^ = } to 3{b^'^ilj), and 

that remains in 3{b^'^'il)) for an amount Ate of time, s.t. 

Pr = {(s{t - Ate), ■ • • ,s{t)) I {s{t - Ate), . . . ,s{t)) a traj., - Ate) 
6. s\=^ Sustains iff there is a process Pr

that leads from {s € 5 |s 1= s N } to 3{b^^^'ip), 

and that remains in 3{b^^^'ip) for an amount Ate of time, s.t. 

Pr = {{s{t - Ate), ■ ■ -,s{t)) I {s{t - Ate), ■ ■ ■ ,s{t)) a traj., - Ate) 

We have omitted the rather obvious formal statement of ‘Pr remains
in ...’. Similar theorems hold for indirect causation and sustaining.

Finally, we can also define what it means to say that a process causes
a state at some time t (relative to a trajectory $(s(t))_{t \in \mathit{In}}$), and that a process
sustains a state at some time t (relative to a trajectory $(s(t))_{t \in \mathit{In}}$); a process is
defined to produce a state at some time t (relative to a trajectory $(s(t))_{t \in \mathit{In}}$) if
and only if it causes or sustains the state at t (relative to a trajectory $(s(t))_{t \in \mathit{In}}$):

Definition 204 (Causation, Sustaining, Production by Processes) 

Let $(s(t))_{t \in \mathit{In}}$ be a possible trajectory of A; let Pr be a process, let X be
a state; let $t \in \mathit{In}$:

1. Pr causes X at t (relative to $(s(t))_{t \in \mathit{In}}$) iff

(a) A is in s(t) in the state X (i.e., $s(t) \in X$)

(b) A is in $(s(t - \Delta t_c), \ldots, s(t))$ in the process Pr
(i.e., $(s(t - \Delta t_c), \ldots, s(t)) \in Pr$)

2. Pr sustains X at t (relative to $(s(t))_{t \in \mathit{In}}$) iff

(a) for all t', s.t. $t \le t' \le t + \Delta t_c$: A is in s(t') in the state X (i.e.,
$s(t') \in X$)

(b) A is in $(s(t), \ldots, s(t + \Delta t_c))$ in the process Pr
(i.e., $(s(t), \ldots, s(t + \Delta t_c)) \in Pr$)

3. Pr produces X at t (relative to $(s(t))_{t \in \mathit{In}}$) iff

Pr causes X at t (relative to $(s(t))_{t \in \mathit{In}}$) or Pr sustains X at t (relative
to $(s(t))_{t \in \mathit{In}}$).
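
The three clauses of Definition 204 may be sketched in Python under simplifying assumptions: time is taken to be discrete, a trajectory is a list indexed by integer time points, and the interval length is an integer number of steps `dt_c`. The function names and the toy trajectory are hypothetical and serve only as an illustration; they are not part of the formal theory.

```python
from typing import Hashable, Sequence, Set, Tuple

Setting = Hashable
Process = Set[Tuple[Setting, ...]]


def causes(traj: Sequence[Setting], pr: Process, x: Set[Setting], t: int, dt_c: int) -> bool:
    """Pr causes X at t: s(t) is in X and (s(t - dt_c), ..., s(t)) is in Pr."""
    if t - dt_c < 0 or t >= len(traj):
        return False                       # the window leaves the trajectory
    window = tuple(traj[t - dt_c : t + 1])
    return traj[t] in x and window in pr


def sustains(traj: Sequence[Setting], pr: Process, x: Set[Setting], t: int, dt_c: int) -> bool:
    """Pr sustains X at t: s(t') is in X for all t <= t' <= t + dt_c,
    and (s(t), ..., s(t + dt_c)) is in Pr."""
    if t < 0 or t + dt_c >= len(traj):
        return False                       # the window leaves the trajectory
    window = tuple(traj[t : t + dt_c + 1])
    return all(s in x for s in window) and window in pr


def produces(traj: Sequence[Setting], pr: Process, x: Set[Setting], t: int, dt_c: int) -> bool:
    """Pr produces X at t iff it causes or sustains X at t."""
    return causes(traj, pr, x, t, dt_c) or sustains(traj, pr, x, t, dt_c)


# Hypothetical toy trajectory s(0), ..., s(3) and state X.
toy_traj = ["a", "b", "c", "c"]
toy_x = {"c"}

print(causes(toy_traj, {("a", "b", "c")}, toy_x, t=2, dt_c=2))
print(sustains(toy_traj, {("c", "c")}, toy_x, t=2, dt_c=1))
print(produces(toy_traj, {("c", "c")}, toy_x, t=2, dt_c=1))
```

Causation looks backwards from t over the interval, sustaining looks forwards from t, and production is simply their disjunction.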

Inferences have been defined in def.27 and 29 by reference to def.22 
(while using the relevant definitions from chapter 4). 




Chapter 22 

GOLDMAN’S RELIABILITY ACCOUNT OF JUSTIFIED BELIEF



Goldman’s reliability account of justification is just one out of a variety of ex- 
ternalist accounts (including the “early” Goldman’s [65] approach which pro- 
poses a causal but still nonreliabilist theory of knowledge). Moreover, there are 
several brands of reliability approaches to justification (e.g., Armstrong’s [9] re- 
liable indicator approach, counterfactual theories like Nozick’s[115], etc.; for 
an overview see Goldman[69], pp. 43-51, and Audi[12], pp. 223-229). But Gold- 
man’s process-reliability account seems to us to be the most favourable one 
as far as an externalist and low-level theory of justified inference is concerned. 
Goldman has suggested at least four different versions of his reliability theory of 
justified belief, and we shall sketch briefly all of these versions in the following 
sections; at the end of this chapter we will deal with the problems that have 
been argued to affect the different versions. The aim of this chapter is not the 
exegesis of Goldman’s account in itself but rather to highlight those facets of 
Goldman’s theories that are relevant for our own theory, and also to point out 
some of the differences between our own account and Goldman’s. For the same 
reason, our comments on Goldman’s theories are by no means intended to give 
a “complete” survey of his elaborate epistemological contributions to reliabilist 
justification. The subsequent exposition is of course addressed to readers who have
not yet studied Goldman’s theory at all or who do not know all of the versions
in which it has been stated.



22.1 The Reliability Approach in “What is Justified Belief?” 

In [67], p.107, Goldman characterizes a theory of justified belief as “a set of
principles that specify truth-conditions for the schema ‘S’s belief in p at time
t is justified’, i.e., conditions for the satisfaction of this schema in all possible
cases.” After pointing out that the “largest common divisor” of most or even
all instances of belief which we intuitively call ‘justified’ seems to be their
reliability, and since all of the intuitively unjustified beliefs seem to be the
outcomes of unreliable procedures, Goldman suggests the following principle:
“If S’s belief in p at t results from a reliable cognitive process, and there is
no reliable or conditionally reliable process available to S which, had it been
used by S in addition to the process actually used, would have resulted in
S’s not believing p at t, then S’s belief in p at t is justified.” (p.123) This is
the more advanced formulation of the original idea expressed on p.116: “If S’s
believing p at t results from a reliable cognitive belief-forming process (or set
of processes), then S’s belief in p at t is justified.” Goldman considers these
formulations as explications of our ordinary standards for justification; they
express the substantive conditions under which a belief is justified, s.t. the
specification of these conditions avoids any epistemic terms and explains why a
belief is justified. Reliability consists in “the tendency of a process to produce
beliefs that are true rather than false” (p.113) where the “term ‘tendency’ could
refer either to actual long-run frequency, or to a ‘propensity’, i.e., outcomes that
would occur in merely possible realizations of the process” (p.114); according
to Goldman, our ordinary conception of justifiedness is too vague to determine
in which sense ‘tendency’ is to be understood. But Goldman acknowledges
that this is precisely the point where reliability turns out to be problematic: where
should reliability be measured? In the actual world, or in “‘natural’ situations”
(p.120), i.e., as we shall put it, in the normal worlds? Moreover, Goldman
seems to understand reliability (at least in some passages) not in an
observer-independent way, when he says: “Since we believe that wishful thinking is an
unreliable belief-forming process, we regard beliefs formed by wishful thinking
as unjustified. What matters, then, is what we believe about wishful thinking,
not what is true (in the long run) about wishful thinking” (Goldman[67], p.121);
and similarly on the same page: “According to our theory, a belief is justified
in case it is caused by a process that is in fact reliable, or by one we generally
believe to be reliable”.

Goldman also introduces the notion of conditional reliability as the 
reliability of processes leading from beliefs to beliefs: a process is conditionally 
reliable according to Goldman if and only if “a sufficient proportion of its
output-beliefs are true given that its input-beliefs are true” (p.117). Both the
concepts of reliability and conditional reliability come in degrees which are given 
by the proportion of true beliefs among the beliefs produced by the processes in 
question. This proportion is called the ‘truth ratio’ of a process in Goldman[69], 
p.26. Accordingly, we might define the degree of justification of beliefs as the 
truth ratio of the process which has caused the belief. 
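
The notions of truth ratio and conditional reliability can be made concrete by a small Python sketch. It rests on assumptions that are not found in Goldman: a process is represented only by a finite record of its runs, and the threshold of 0.9 is an arbitrary placeholder for what counts as “a sufficient proportion”. All names are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def truth_ratio(outputs: Iterable[bool]) -> Optional[float]:
    """Proportion of true beliefs among the output beliefs (None if no outputs)."""
    outs = list(outputs)
    if not outs:
        return None
    return sum(outs) / len(outs)


def conditional_truth_ratio(runs: Iterable[Tuple[bool, bool]]) -> Optional[float]:
    """Truth ratio of the outputs, restricted to runs whose input beliefs were all true.

    Each run is a pair (inputs_true, output_true).
    """
    return truth_ratio(out for inputs_true, out in runs if inputs_true)


def meets_threshold(ratio: Optional[float], threshold: float = 0.9) -> bool:
    """Assumed reading of 'sufficient proportion': ratio >= threshold."""
    return ratio is not None and ratio >= threshold


# Hypothetical record of a belief-independent process (e.g. perception).
perception_outputs = [True, True, True, False]
# Hypothetical record of a belief-dependent process (e.g. some inference).
inference_runs = [(True, True), (True, True), (False, False), (True, False)]

print(truth_ratio(perception_outputs))
print(conditional_truth_ratio(inference_runs))
```

On this reading, the run with false input beliefs does not count against the conditional reliability of the inference process, which is the point of the conditional notion.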



22.2 The Reliability Approach in “Epistemology and Cognition”

In [69] three truth-linked standards of epistemological evaluation are presented: 
(i) reliability: “An object (a process, method, system, or what have 
you) is reliable if and only if (1) it is a sort of thing that tends to produce beliefs, 
and (2) the proportion of true beliefs among the beliefs it produces meets 
some threshold, or criterion, value. Reliability, then, consists in a tendency
to produce a high truth ratio of beliefs . . . The Reliability standard will be
invoked in connection with the evaluative notion of justifiedness. More precisely, 
reliability is one component in a complex standard appropriate to justification 
. . . A reliable process, method, or procedure is an antidote to error.” (p.26). 

(ii) power: “Power is the capacity of a process, method, system, or 
what have you to produce a large number of true beliefs; or . . . the capacity to 
produce true beliefs in answer to a high ratio of questions one wants to answer or 
problems one wants to solve ... A method or system can be very reliable without 
being very powerful; and a method or system can be pretty powerful but not 
terribly reliable. I believe that the power standard, especially the problem- 
solving variant, is associated with the evaluational term ‘intelligent’.” (p.27). 
Power is regarded by Goldman as the antidote to ignorance. 

(iii) speed: “that is, speed in getting true beliefs.” (p.27). 

E.g., a process which constantly produces the belief in the tautology 
‘The sky is blue or it is not’ is extremely reliable, but not very powerful; a 
process producing all beliefs whatsoever would be extremely powerful but not 
reliable at all. Moreover, if a process is indeed both reliable and powerful, but 
it takes 100000 years until it generates its first output belief, the process will 
nevertheless be epistemically worthless because it lacks speed. 

Since Goldman is particularly interested in the justification of beliefs, 
reliability is the standard he focuses on. He distinguishes two kinds of reliability 
corresponding to two classes of processes: 

(a) first-order processes: those are the belief-forming processes. The 
reliability of such processes is called ‘first-order reliability’. 

(b) second-order processes: the processes “that produce or modify
belief-forming processes or methods. A second-order process may be called
second-order reliable if the processes it tends to produce are reliable, or,
alternatively, if the modifications it introduces tend to increase reliability.” (p.27).

The class of processes is also divided into another pair of subclasses: 

(A) basic (elementary, native) processes: they are part of the fixed,
native architecture of the cognitive system, i.e., they are not acquired by some
learning process; as Goldman writes, they are “... original, innate features of
the system” (p.366); in [70], p.128, he describes them as “wired-in features of
our native cognitive architecture”, and in [73], p.14, as parts of “a person’s
fundamental cognitive architecture”.

(B) acquired processes: they are not basic; in [73], p.14, he characterizes
them as “something that is not part of one’s fundamental cognitive architecture,
but something learned”. Goldman calls such processes ‘methods’, including
“algorithms, heuristics, skills and techniques of various sorts”, “or learnable
methodologies” (p.93). In [70], p.129, Goldman adds examples like “procedures
that appeal to instrument readings, or statistical analyses”. Methods have to
be acquired (and maybe sustained) by second-order processes.*

*On p.391, Goldman[69] admits: “A fully satisfactory formulation of the process/method
distinction ... has eluded me. I shall rely throughout on an informal grasp of this distinction.”

Two corresponding levels of justification are distinguished:

(I) primary justifiedness (P-justifiedness): “P-justifiedness results from
the use of approved [basic] processes” (p.93).

(II) secondary justifiedness (S-justifiedness): “S-justifiedness results
from the use of approved methods” (p.93).

Full justifiedness is the combination of P- and S-justifiedness. But
Goldman concentrates on P-justifiedness, i.e., the rules of justification which
he actually considers only permit basic cognitive processes; see Goldman’s
ARI-criterion below.

Goldman approaches his account of justified belief in terms of a rule
framework, and develops his theory in three stages or levels:

1. the level of the framework principle

2. the level of the criterion

3. the level of the J-rule system.

The framework principle “is intended to express a semantic truth about
the language of justified belief” (Goldman[69], p.59), and is presented in
different versions, which we will denote by Goldman’s original terms:

• (P1)

“S’s believing p at time t is justified if and only if S’s believing p at t is
permitted by a right system of justificational rules (J-rules)” (Goldman[69],
p.59)

(P1) is later strengthened to†:

• (P3)

“S’s believing p at t is justified if and only if

(i) S’s believing p at t is permitted by a right system of J-rules, and

(ii) this permission is not undermined by S’s cognitive state at t.”
(Goldman[69], p.63)

‘Undermining’ is explained as follows: “a permission for S to believe p
is undermined if S is permitted (by right J-rules) to have a belief in the denial
of this (lower-level) permission.”‡

†Goldman also suggests a principle (P2) but only to abandon it instantly.

‡This strongly reminds one of the defeasibility approaches to justification: see e.g.
Pollock[125], chapter 7.

But Goldman does not really focus on (P3), but rather (for the sake
of simplicity) on (P1). (P1) is later also strengthened in a different way:

• (P1*)

“A cognizer’s belief in p at time t is justified if and only if it is the final
member of a finite sequence of doxastic states of the cognizer such that
some (single) right J-rule system licenses the transition of each member
of the sequence from some earlier state(s)” (Goldman[69], p.83).

In the same way as (P1) may be strengthened to (P1*), also (P3)
might be strengthened to a framework principle (P3*), but this is not done by
Goldman.
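
The structure of the sequence-based framework principle can be illustrated by a rough Python sketch. It is not Goldman’s formulation: doxastic states are represented by strings, a J-rule by a function that decides whether a state is licensed given the earlier members of the sequence, and the rightness of the rule system is simply presupposed. As a further assumption, the first member of the sequence is checked against an empty tuple of earlier states, so that a rule may license “basic” states from nothing. All names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Doxastic = str
JRule = Callable[[Tuple[Doxastic, ...], Doxastic], bool]

# Hypothetical stock of beliefs that the toy system licenses without premises.
BASIC_BELIEFS = {"rain", "rain->wet"}


def basic_rule(earlier: Tuple[Doxastic, ...], target: Doxastic) -> bool:
    """Licenses the assumed basic beliefs regardless of earlier states."""
    return target in BASIC_BELIEFS


def detachment_rule(earlier: Tuple[Doxastic, ...], target: Doxastic) -> bool:
    """Licenses q if some earlier states are p and 'p->q'."""
    return any(f"{p}->{target}" in earlier for p in earlier)


def licensed_sequence(states: Sequence[Doxastic], system: Sequence[JRule]) -> bool:
    """True iff the one given rule system licenses each member of a non-empty
    sequence on the basis of the members that precede it."""
    if not states:
        return False
    return all(
        any(rule(tuple(states[:i]), states[i]) for rule in system)
        for i in range(len(states))
    )


toy_system = [basic_rule, detachment_rule]

print(licensed_sequence(["rain", "rain->wet", "wet"], toy_system))
print(licensed_sequence(["wet"], toy_system))
```

The belief ‘wet’ counts as licensed only as the final member of a sequence in which its premises occur earlier, which reflects the historical character of the principle: one and the same system has to license every transition in the sequence.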

As Goldman points out, “(P1) links justifiedness with J-rule permission
rather than obligatoriness. . . For example, if a person has a certain corpus
of prior beliefs, the rules might permit him to infer a further proposition that
logically follows from this corpus. But the rules might not mandate this
inference” (Goldman[69], p.60).

Moreover, Goldman emphasizes that “systems of J-rules are assumed
to permit or prohibit beliefs, directly or indirectly, as a function of some states,
relations, or processes of the cognizer” (Goldman[69], p.60). This formulation
seems to indicate an internalist conception. But whether (P1) actually leads
to an internalist or to an externalist theory depends on what a “right”
J-rule system looks like; e.g., if J-rules are later assumed to permit beliefs as
a function of the reliability of the agent’s processes in the actual world, an
internalist approach is abandoned. By ‘rule’ Goldman does not refer to the
kind of rules which are guiding agents: “A person need not even understand the
rules, and if he does, he need not be able to apply them in the process of belief
formation.” (Goldman[69], p.59). Goldman just speaks of a rule framework in
order to emphasize the normative and regulative dimension of justification.

The reason why Goldman epistemically assesses rule systems rather 
than single rules is this: “rules are interdependent with respect to their epistem- 
ically relevant properties. In particular, they are interdependent with respect 
to truth-ratio properties. This is especially clear for inferential rules. A rule of 
‘detachment’, for example, might not spell trouble if complemented by certain 
rules, but will breed what is known as the lottery paradox if complemented 
by other rules. A sound inferential rule will generate additional true beliefs 
when applied to true input beliefs. But if other rules permit false beliefs to 
be formed, then even a sound inferential rule may produce innumerable errors. 
Truth-ratio propensities, then, only make sense as applied to rule systems, not
isolated rules” (Goldman[69], p.115).

(P3) adds a clause of epistemic “meta”-permission to (P1). In (P1*) the
J-rule system does not license some of the agent’s cognitive states, but rather
the transition between states; (P1*) is the format to be used by Goldman most
of the time in [69].

None of the principles above tells us anything about the conditions
under which a belief is justified; they just relate the justifiedness of a belief
to the rightness of a system of J-rules. What we need now is “a very general
set of conditions that are necessary and sufficient for a system of J-rules to be
right” (Goldman[69], p.64). Such a set of conditions is called a “criterion of
J-rule rightness” by Goldman. What such a criterion looks like also depends on
what form the J-rules should have. In particular, it should be specified what
kinds of objects are to be permitted by such rules: “A direct form of J-rules
would expressly permit certain beliefs, or would present schemas for belief
permission. For example, a rule might permit belief in any proposition that has
a certain type of relation to other propositions already believed. An indirect
form of J-rules would license the execution of certain belief-forming processes
or operations; the rules would not specify the permitted beliefs, but would
indirectly sanction the belief output of these licensed processes” (Goldman[69],
p.75). Goldman opts for the indirect form of J-rules, since this conforms to his
process reliabilism, which takes the “history” of a belief into account. Goldman
also argues that processes may not be exchanged for cognitive state-transitions,
since if we did so, “no constraints on the specific causal path by which the belief
is formed” would be placed.

Thus he presents the following criterion, which is intended to be applied 
to first-order processes: 

• (ARI)§

“A J-rule system R is right if and only if R permits certain (basic)
psychological processes, and the instantiation of these processes would result in
a truth ratio of beliefs that meets some specified high threshold (greater
than .50)” (Goldman[69], p.106).

As Goldman adds: “Since (ARI) does not designate any particular
threshold, it is really a criterion-schema” (Goldman[69], p.106).¶

§‘ARI’ stands for absolute, resource-independent, where ‘absolute resource-independent’
is meant to contradict ‘resource-relative’; “A resource-independent criterion fixes an
acceptable truth ratio without regard to the resources of the ... cognitive system in question”
(Goldman[69], p.104).

¶Curiously, Goldman first argues that state-transitions have to be replaced by the more
“fine-grained” processes, but then the latter ones are only judged by (ARI) according to their
outputs, i.e., according to which state-transition they lead to. Thus, Goldman’s preference
for processes only seems to be necessary if the processes in question are not basic, since then
it is indeed important which second-order processes have caused the first-order processes
considered, and here more than just the input-output relation defined by a first-order process
is asked for.

Similar to [67], “the following question arises: Is the rightness of a
rule system determined by its truth ratio in the actual world, and in that
world only? Or should the performance of the rule system also be judged by
its performance in other possible worlds? Or is a still different performance
measure appropriate? Obviously, a given rule system could perform well in one
possible world — say the actual world — and poorly in another. Which possible
worlds are relevant to the rightness of a rule system, and ultimately to the
justifiedness of a belief formed in compliance with the system?” (Goldman[69],
p.106). This issue is connected to another of Goldman’s questions: “Is a J-rule
system that is right in one possible world also right in all possible worlds? In
other words, is rightness a ‘rigid’ designator?” (Goldman[69], p.106).

The answer that Goldman gives with respect to the first question is that 
reliability is to be measured in the normal worlds, and his answer to the second 
question is affirmative. We state Goldman’s comment on these answers in the 
form of a rather lengthy citation, but the latter will turn out to be important 
in further respects: “We have a large set of common beliefs about the actual 
world: general beliefs about the sorts of objects, events, and changes that occur 
in it. We have beliefs about the kinds of things that, realistically, do and can 
happen. Our beliefs on this score generate what I shall call the set of normal 
worlds. These are worlds consistent with our general beliefs about the actual 
world. (I emphasize ‘general’, since I count worlds with different particular 
episodes and individuals as normal.) Our concept of justification is constructed 
against the backdrop of such a set of normal worlds. My proposal is that, 
according to our ordinary conception of justifiedness, a rule system is right in 
any world W just in case it has a sufficiently high truth ratio in normal worlds. 
Rightness is rigidified for all worlds; but it is rigidified as a function of reliability 
in normal worlds, not reliability in the actual world. Rightness of rules — and 
hence justifiedness — displays normal-world chauvinism. Obviously, the notion
of normal worlds is quite vague. But I do not find that objectionable. I think 
our ordinary notion of justifiedness is vague on just this point” (Goldman[69], 
p.107). 

Thus, according to Goldman’s view, whether a world is normal or not, 
is determined by our general beliefs concerning our world, i.e., the actual world. 
The notion of world normality is therefore not an objective, agent-independent 
one: “beliefs are deemed justified when (roughly) they are caused by processes 
that are reliable in the world as it is presumed to be. Justification-conferring 
processes are ones that would be reliable in worlds like the presumptively actual 
world, that is, in normal worlds” (Goldman[69], p.108). We will discuss this 
point in more detail in subsection 22.5.3. 
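
The interplay of the (ARI) criterion-schema with the normal-worlds proposal can be sketched in Python. Several assumptions are made that Goldman leaves open: worlds are represented by finite records of the truth values of the beliefs a rule system would yield in them, the truth ratio “in normal worlds” is computed by pooling all normal worlds, and the threshold value 0.8 is an arbitrary instance of the schema. The names and the toy worlds are hypothetical.

```python
from typing import Dict, List, Sequence


def pooled_truth_ratio(outcomes_by_world: Dict[str, List[bool]], worlds: Sequence[str]) -> float:
    """Truth ratio of all beliefs produced in the given worlds, pooled together."""
    outcomes = [o for w in worlds for o in outcomes_by_world[w]]
    if not outcomes:
        raise ValueError("no beliefs produced in the given worlds")
    return sum(outcomes) / len(outcomes)


def is_right(outcomes_by_world: Dict[str, List[bool]],
             normal_worlds: Sequence[str],
             threshold: float = 0.8) -> bool:
    """Rightness of a rule system, measured in the normal worlds only.

    The function takes no 'world of evaluation' argument: on the normal-worlds
    proposal the verdict is the same in every world (rightness is rigidified).
    """
    if threshold <= 0.5:
        raise ValueError("(ARI) requires a threshold greater than .50")
    return pooled_truth_ratio(outcomes_by_world, normal_worlds) >= threshold


# Hypothetical performance of one rule system in three worlds.
toy_outcomes = {
    "normal_1": [True, True, True, True],
    "normal_2": [True, True, True, False],
    "demon_world": [False, False, False, False],
}
toy_normal = ["normal_1", "normal_2"]

print(pooled_truth_ratio(toy_outcomes, toy_normal))
print(is_right(toy_outcomes, toy_normal))
print(pooled_truth_ratio(toy_outcomes, ["demon_world"]))
```

In the toy example the rule system counts as right even with respect to the demon world, although its truth ratio there is zero; this is the “normal-world chauvinism” described in the citation above.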

Although Goldman does not explicitly extend his criterion to second- 
order reliability, he indicates that this might be done in essentially two ways: “A 
second-order process might be considered metareliable if, among the methods 
(or first-order processes) it outputs, the ratio of those that are reliable meets 
some specified level, presumably greater than .50 . . . A weaker conception of 
metareliability would merely require increases in reliability, not an absolute 
ratio of reliability. On this conception a second-order process would be
metareliable if it modifies or replaces processes (methods) so as always to increase
levels of reliability” (Goldman[69], p.115).
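
The difference between the two conceptions of metareliability can be shown by a short Python sketch. The numerical values are assumptions: first-order reliability is identified with a truth ratio of at least 0.8, and the “specified level” for the stronger conception is set to 0.6 (any value greater than .50 would do). The function names are hypothetical.

```python
from typing import Sequence, Tuple


def metareliable_absolute(output_truth_ratios: Sequence[float],
                          reliability_threshold: float = 0.8,
                          level: float = 0.6) -> bool:
    """Stronger conception: among the methods (or first-order processes) that the
    second-order process outputs, the ratio of reliable ones meets the level."""
    if not output_truth_ratios:
        return False
    reliable = sum(r >= reliability_threshold for r in output_truth_ratios)
    return reliable / len(output_truth_ratios) >= level


def metareliable_incremental(modifications: Sequence[Tuple[float, float]]) -> bool:
    """Weaker conception: every modification (before, after) increases the
    truth ratio of the modified process."""
    return bool(modifications) and all(after > before for before, after in modifications)


# Hypothetical second-order process A: outputs three methods, two of them reliable.
print(metareliable_absolute([0.9, 0.85, 0.4]))

# Hypothetical second-order process B: always improves, but never reaches 0.8.
improvements = [(0.3, 0.4), (0.4, 0.45)]
print(metareliable_incremental(improvements))
print(metareliable_absolute([after for _, after in improvements]))
```

Process B is metareliable in the weaker sense only: it always increases reliability, but none of the resulting processes is itself reliable.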
Now that Goldman has related justifiedness to right systems of J-rules
and has given a criterion of rightness, he still has “to determine which
systems of J-rules in fact satisfies the chosen criterion” (Goldman[69], p.63).
But this step is omitted by Goldman: “I shall not attempt to specify in detail
any system of J-rules that satisfies the chosen criterion” (Goldman[69], p.63).‖

‖But in part III we indeed state a concrete system of J-rules for nonmonotonic inferences,
s.t. this system may even be proven to be right.



22.3 The Reliability Approach in “Strong and Weak Justification”

In [70] Goldman distinguishes between “two distinct ideas or conceptions of
epistemic justification. On the one conception, a justified belief is (roughly) a
well-formed belief, a belief formed (or sustained) by proper, suitable, or
adequate methods, procedures, or processes. On another conception, a justified
belief is a faultless, blameless, or nonculpable belief” (p.128). The first kind of
justification is called ‘strong’, the second ‘weak’; strong justification is what
is called ‘justification’ (simpliciter) in Goldman[68] and [69]. The notions of
strong and weak justification may be applied either to (basic) processes or to
methods (see the previous section). Strong justification of methods is captured by
a necessary and sufficient condition in the same way as justification has been
characterized in [69]. Weak justification is introduced by Goldman essentially
to accommodate the intuition that a cognitive agent should not be blamed for
having a (strongly) unjustified belief, when her cultural, social, and scientific
environment is actually sharing this belief. But this topic is definitely beyond
the domain of our interest and thus we will disregard the concept of weak
justification in the following. In section VI of [70], Goldman returns to the problem
of “where” reliability is to be measured, and reconsiders his normal worlds
account in [69]: “A very special sense of ‘normal worlds’ is delineated. A normal
world is understood as a world consistent with our general beliefs about the
actual world, beliefs about the sorts of objects, events, and changes that occur
in the actual world” (Goldman[70], p.135). On pp.136f Goldman summarizes
both the merits and the shortcomings of his normal worlds account of
reliability, and explains what he had thought would support the doxastic sense of
“normalcy” which he employs: “What might rationalize the doxastic sense? In
writing Epistemology and Cognition, I had planned a chapter on concepts. One
thesis I planned to defend is that our concepts are constructed against certain
background assumptions, comprised of what we believe about what typically
happens in the actual world. I expected this approach to concepts to underpin
the doxastic-normalcy conception of reliability. Unfortunately, I did not
manage to work out such an approach in detail.” This leads to Goldman’s attack
which is directed against his own normal-worlds version of reliabilism: “... there
are a number of problems facing the account of justification that focuses on
normal worlds (construed doxastically). First, which general beliefs about the
actual world are relevant in fixing normal worlds? There seem to be too many
choices. Second, whichever general beliefs are selected, it looks as if
dramatically different worlds might conform to these beliefs. Does a rule system count
as right only if it has a high truth ratio in all those worlds? Third, when the
theory says that normal worlds are fixed by the general beliefs we have about
the actual world, what is the referent of ‘we’? Is it everyone in the actual
world, i.e., the whole human race? Different members of the human race have
dramatically divergent general beliefs. How are the pertinent general beliefs to
be extracted? Finally, even if these problems could be resolved, it isn’t clear
that the normal-worlds approach gets things right. Consider a possible
non-normal world W, significantly different from ours. In W people commonly form
beliefs by a process that has a very high truth ratio in W, but would not have
a high truth ratio in normal worlds. Couldn’t the beliefs formed by the process
in W qualify as justified?” Goldman concludes that the normal-worlds version
of reliabilism which he had developed in [69] has to be abandoned and must
be replaced by an improved reliability account. The first account he suggests is
also the most straightforward one: “a rule system is right in W just in case it
has a high truth ratio in W” (Goldman[70], p.137). But, as Goldman argues,
a rule system might be used sparsely in a world, and its actual performance
might therefore not truly reflect its degree of justifiedness. Goldman concludes
on p.137: “For this reason, it seems advisable to assess its [the rule system’s]
rightness in W not simply by its performance in W, but by its performance
in a set of worlds very close to W. In other words, we should be interested
in the probability of a rule system yielding truths in the propensity, or modal
frequency, interpretation of probability.”



22.4 The Reliability Approach in “Epistemic Folkways and Scientific Epistemology”

[71] reconsiders the question whether epistemology should aim at the unravel- 
ling of our commonsense epistemic concepts, principles and norms (these com- 
monsense conceptions are termed ‘epistemic folkways’ by Goldman), or whether 
such an approach is misconceived right from the start. Goldman accepts that 
at least “one proper task of epistemology is to elucidate our epistemic
folkways” (Goldman[71], p.155). But he adds that “The second mission of
epistemology ... is the formulation of a more adequate, sound, or systematic set
of epistemic norms, in some way(s) transcending our naive epistemic
repertoire” (Goldman[71], p.156); between both tasks there should be some
“continuity.” As far as the first task is concerned, justificational evaluation
involves two stages, where the first one is called in Goldman[73], p.11, the
“standard-selection stage”, and the second one the “standard deployment” stage:
“The first stage features the acquisition by an evaluator of some set of
intellectual virtues and vices” (Goldman[71], p.163). I.e., the epistemic
evaluator acquires “a mentally stored set, or list, of cognitive virtues and
vices” (Goldman[71], p.157), and this is done in the way that “Belief-forming
processes . . . are deemed virtuous because they (are deemed to) produce a high
ratio of true beliefs . . . Processes are deemed vicious because they (are
deemed to) produce a low ratio of true beliefs” (Goldman[71], p.160). “In the
second stage, the evaluator applies his list of virtues and vices to decide the
epistemic status of targeted beliefs” (Goldman[71], p.163); “When asked to
evaluate an actual or hypothetical case of belief, the evaluator considers the
processes by which the belief was produced, and matches these against his list
of virtues and vices. If the processes match virtues only, the belief is
classified as justified. If the processes are matched partly with vices, the
belief is categorized as unjustified. If a belief-forming scenario is described
that features a process not on the evaluator’s list of either virtues or vices,
the belief may be categorized neither justified nor unjustified, but simply
nonjustified” (Goldman[71], p.157). Here, a difference between unjustifiedness
and nonjustifiedness is introduced. The matching between the list of virtues
and vices, and the beliefs to be judged, is not supposed to be perfect but
rather based on similarity considerations. The notion of reliability used by
Goldman in [71] is specified in the way that “On the present rendering, it
looks as if the folk notion of justification is keyed to dispositions to
produce a high ratio of true beliefs in the actual world, not in ‘normal
worlds’” (Goldman[71], p.164).
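
The two-stage evaluation just described can be read as a simple classification procedure. The following Python sketch is merely an illustration under simplifying assumptions of our own: matching is reduced to exact list membership (whereas Goldman intends a similarity-based matching), a single matched vice is taken to suffice for unjustifiedness, and the names Status and evaluate_belief as well as the sample lists of virtues and vices are hypothetical.

```python
from enum import Enum


class Status(Enum):
    JUSTIFIED = "justified"
    UNJUSTIFIED = "unjustified"
    NONJUSTIFIED = "nonjustified"


# Hypothetical lists acquired in the first ("standard-selection") stage.
VIRTUES = {"perception", "memory", "good reasoning"}
VICES = {"wishful thinking", "ignoring contrary evidence"}


def evaluate_belief(processes, virtues=VIRTUES, vices=VICES):
    """Second ("standard deployment") stage: classify a belief by the
    processes that produced it. Matching is exact membership here,
    which is a simplification of similarity-based matching."""
    processes = set(processes)
    if processes & vices:
        # The processes are matched at least partly with vices.
        return Status.UNJUSTIFIED
    if processes and processes <= virtues:
        # The processes match virtues only.
        return Status.JUSTIFIED
    # Some process is on neither list.
    return Status.NONJUSTIFIED


if __name__ == "__main__":
    print(evaluate_belief({"perception"}))
    print(evaluate_belief({"perception", "ignoring contrary evidence"}))
    print(evaluate_belief({"clairvoyance"}))
```

On this reading, a belief produced by perception together with the vice of ignoring contrary evidence comes out as unjustified, whereas a belief produced by a process that is on neither list comes out as nonjustified.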

In [71] the basic ideas of Goldman’s previous process-reliabilist
accounts of justification are preserved but the format of the theory has been
changed. An account of reliabilism, which is similar to the one just discussed, 
has also been published more recently by Goldman in [73]. 

22.5 Discussion: The Problems for Goldman’s Approach(es) 

22.5.1 Alleged Counterexamples 

It is well-known (see e.g. Audi[12], pp. 224-229, or Brendel[24], pp. 198-202) 
that reliabilist approaches to knowledge, and also to justified belief, suffer from 
certain deficits. There are several examples which seem to show that the
reliability conception of justification is too weak in some respects, and too
strong in others. Although neither the justification of beliefs nor knowledge is
really our topic, since we aim at the justification of inference, it is
instructive to see 
what such counterexamples look like: 

Example 205 (“Showing” That the Reliability Approach Is Not Sufficient 
for Justification and/or Knowledge) 

1. A typical hypothetical counterexample to the reliability approach is the 
“sweepstakes example”: “you might know that you hold just one out of a 
million coupons in a fair sweepstakes, which will have one winner. You 
may . . . infer that, with very high probability, 0.999999, that you will lose, 
since 999,999 of the million coupons will lose. But you do not know you 
will lose” (Audi[12], p.166; see also p.227). On the other hand, the
inference drawn in this example seems to be quite reliable. Thus we seem 
to have an example of high reliability without knowledge, and therefore 
reliability does not seem to be sufficient for knowledge. 

2. Putnam[127] presents another counterexample: suppose the Dalai Lama is 
infallible. Then anyone who believes any statement the Dalai Lama makes, 
uses a method which is 100 percent reliable. According to reliabilism, every 
such belief is therefore also justified. But this is surely counterintuitive. 

3. A further example indicating the insufficiency of the reliability approach 
is given by BonJour[21]: in this case let there be an agent having com- 
pletely reliable clairvoyant power. The clairvoyance-caused beliefs should 
therefore be justified, and, since they are also true, constitute knowledge. 
But the agent might e.g. have explicit counterevidence against her
possession of a clairvoyant power, and just ignores this evidence. Or the agent 
might have counterevidence against the clairvoyance-caused belief itself, 
and ignore this evidence. In all these cases we would say that the agent 
is not justified in the clairvoyance-caused beliefs, although they are the 
output of a reliable process. Therefore, reliability is not sufficient for
justifiedness. This problem is called the “meta-incoherence problem” by Sosa 
[166], p.132. 

4. Three further examples (this one, 5, 6) are given by Goldman: “. . . suppose
our friend Humperdink has attended a series of talks on mathematics by 
a certain friend Elmer Fraud. These talks are not under the auspices of 
any certified educational institution, and Humperdink has been warned 
that Fraud has no credentials in mathematics. Humperdink hears Fraud 
enunciate numerous principles and algorithms, almost all of them
defective. Nonetheless, being a complete novice — and a gullible one at that — 
Humperdink blindly accepts and applies them all. In one case, however, 
Fraud happens to teach a perfectly correct algorithm. Humperdink
internalizes this one along with the others, and applies it to a relevant class 
of problems. In using this algorithm to solve a problem, Humperdink gets 
the answer right and forms a true belief in the answer. This belief is 
the result of a reliable process, namely, the algorithm . . . Clearly, though, 
Humperdink should not be credited with knowledge” (Goldman[69],
pp.51-52). 

5. “Millicent in fact possesses her normal visual powers, but has cogent
reasons to believe these powers are temporarily deranged. She is a subject 
of a neurosurgeon’s experiments, and the surgeon falsely tells her that 
current implantations are causing malfunction in her visual cortex. She 
is persuaded that her present visual appearances are no guide at all to 
reality. Yet despite this belief, she continues to place credence in her
visual percepts. She ignores the well-justified belief in the incapacitation of 
her visual faculty; she persists in believing, on the basis of visual appear- 
ances, that a chair is before her, that the neurosurgeon is wearing a yellow 
smock, and so on. Now these beliefs are all, in fact, true. Moreover, they 
are formed by the usual, quite reliable, perceptual processes. But are they 
specimens of knowledge? Intuitively, no. The reason is that Millicent is 
not justified in holding these beliefs; they contravene her best evidence. 
It seems, then, that causation by reliable processes is not sufficient for 
knowing” (Goldman[69], p.53). 

6. “Maurice uses a reliable — but not perfectly reliable — heuristic to arrive at 
a certain belief p. Now the fact that the process is not perfectly reliable does 
not by itself preclude knowledge. I do not assume that perfect reliability 
is required. But there is another heuristic Maurice knows, which is more 
reliable than the first (though still not perfect), and Maurice believes it is 
more reliable. He even suspects, in this case, that the better heuristic might 
yield a different result, since it has differed from the first in similar cases 
before. But despite these beliefs, Maurice neglects the superior heuristic. 
Had he used it, it would indeed have led him to a different conclusion: to 
believe not-p rather than p. But in this particular case the first heuristic 
gets things right: p happens to be true. Does Maurice know p? Intuitively, 
no. He does not know, once again, because his belief in p is not justified. 
He should have consulted the superior heuristic” (Goldman[69], p.Bf). 

7. The Case of the Epistemically Serendipitous Lesion: “There is a rare but 
specific sort of brain lesion (we may suppose) that is always associated 
with a number of cognitive processes of the relevant degree of specificity, 
most of which cause its victim to hold absurdly false beliefs. One of the 
associated processes, however, causes the victim to believe that he has a 
brain lesion. Suppose, then, that S suffers from this sort of disorder and 
accordingly believes that he suffers from a brain lesion. Add that he has 
no evidence at all for this belief . . . Then the relevant type . . . will certainly 
be highly reliable; but the resulting belief — that he has a brain lesion — will 
have little by way of warrant for S” (Plantinga[123], p.199). 

8. The Case of the Helpful Demon: “Rene thinks he can beat the roulette
tables with a system he has devised. Reasoning according to the Gambler’s 
Fallacy, he believes that numbers which have not come up for long strings 
are more likely to come up next. However, unlike Descartes’ demon vic- 
tim, our Rene has a demon helper. Acting as a kind of epistemic guardian 
angel, every time Rene forms a belief that a number will come up next, 
the demon arranges reality so as to make the belief come out true. Given 
the ever present interventions of the helpful demon, Rene’s belief forming 
process is highly reliable. But this is because the world is made to conform
to Rene’s beliefs, rather than because Rene’s beliefs conform to the 
world” (Greco[75], p.286). 

To what extent Goldman’s reliabilist account is able to handle these 
“counterexamples” depends on which version of Goldman’s reliabilism is used. 
We will now try to state explicitly what is really needed to see that the examples 
do not show what they are intended to show, or why the argument that is 
behind the example is actually irrelevant for Goldman’s approach, or at least 
for our own purposes. 

The Sweepstakes example only shows that the reliability account of 
justification is not sufficient for the kind of justification or warrant that is 
presupposed by our ordinary notion of knowledge. But it is not a problem 
for theories that intend to define justification simpliciter by reliability and 
that are interested in justification per se and not in knowledge - our theory of 
justified inference is a theory of this kind - because the person in the example is 
intuitively justified in believing that she will lose. Goldman is of course also aware 
of the fact that his reliabilist concept of justification has to be supplemented 
(e.g., by a relevant alternatives approach, which is also referred to as ‘local 
reliability’ - see Goldman[66]) if knowledge is to be attained. 

As far as the Dalai Lama example is concerned, we are not told all the 
relevant details. Suppose, e.g., that the person who believes anything the Dalai 
Lama says does so because his mother has told him to. But then this 
method of following the Dalai Lama blindly has been acquired by a second-order
process, which might not be second-order reliable; e.g., his mother might 
be simply insane. Only if the second-order process is reliable, too, would we 
have a case of justification according to Goldman’s account (recall the remark 
on full justifiedness on p.330 above). Here, second-order reliability enters 
the picture. 

BonJour’s examples are handled by Goldman by reference to the non- 
undermining condition expressed by (P3) on p.330: the beliefs generated by 
clairvoyance powers are undermined by counterevidence. Therefore, he needs 
a non-undermining condition for justified belief. From the point of view of 
Goldman[71] (and also [73]; see section 22.4), the clairvoyance-caused beliefs 
indeed turn out to be unjustified or nonjustified - depending on the details 
of the example version. E.g. in the case of the agent having counterevidence 
against the clairvoyance-caused belief, “the evaluator will match these agent’s 
belief-forming processes to the vice of ignoring contrary evidence. Since the
processes include a vice, the beliefs will be judged to be unjustified” (Goldman[71], 
p.159). 

The Humperdink example may be handled in the same way as the 
Dalai Lama example, since again a process that is not second-order reliable 
has been used to acquire the “perfectly reliable” algorithm considered. The 
Millicent example is analogous to BonJour’s examples. 

Maurice is not justified in holding his belief since he violates the first 
principle expressed in section 22.1, i.e., it is not sufficient that a reliable process 
has been used to produce a belief, but it is also necessary that no other process 
has been available, which would have been even more reliable, but which would 
have produced a contradictory belief. 

Plantinga’s example is a case of a reliable first-order process being 
caused by an unreliable second-order process, namely, the process of injuring 
the relevant brain parts; of course, this is not really a cognitive process, but 
perhaps it is still relevant - it is a process having cognitive consequences, since 
it “produces or modifies belief-forming processes or methods”. 

In the Case of the Helpful Demon, it is simply not true that “Rene’s 
belief forming process is highly reliable”, since reliability is meant to be the
reliability in the normal worlds. If this normal worlds account is not presupposed 
or even abandoned - as Goldman does in his more recent writings - then the 
example is indeed problematic for Goldman’s reliabilism (see the discussion of 
the “brain in a vat” and the “evil-demon problem” below). 

The next two examples seem to indicate that Goldman’s reliabilism is 
actually too strong: 

Example 206 (“Showing” That the Reliability Approach Is Not Necessary for 
Justification and/or Knowledge) 

1. Pollock presents the following “brain in a vat” example, which seems to 
show that reliability is not even necessary for justifiedness: “Harry . . . had 
his brain removed and wired into a computer that directly stimulated his 
visual cortex so that he had normal-seeming sensory experiences but they 
were totally unrelated to his physical surroundings. For Harry, perception 
became an unreliable cognitive process, and thus the reliabilist is commit- 
ted to regarding Harry’s perceptual beliefs as unjustified. But this seems 
wrong. Harry has no reason to suspect that anything is amiss, so if he 
takes reasonable care in forming perceptual judgments we will regard them 
as justified” (Pollock [125], p.llj). 

2. Goldman himself suggests a similar might-be counterexample to his re- 
liability approach: “Consider a possible world in which a Cartesian de- 
mon systematically deceives a certain cognizer . . . The cognizer employs 
the very same psychological processes that you or I use [and suppose we 
are epistemically permitted in using them], but the result is a massively 
false set of beliefs. Since his processes are not reliable, reliabilism im- 
plies that his beliefs are not justified. But intuitively they are justified” 
(Goldman[69], p.110). This problem is termed the “evil-demon problem” 
for reliabilism by Sosa[166], p.132. 

Pollock’s “brain in a vat” example and Goldman’s demon example are 
of the same kind: the worlds described in both examples are non-normal worlds. 
Thus, those examples are indeed no counterexamples to the (ARI) criterion on 
p.332, because reliability is to be measured in normal worlds, and is therefore 
unaffected by brains in a vat and Cartesian demons. The processes described 
in both examples are simply not unreliable at all, i.e., they are indeed reliable 
in the normal worlds. But this defense only works if we use Goldman’s nor- 
mal worlds approach suggested in Epistemology and Cognition. However, if we 
understand reliability as the reliability in the actual world, and if the actual 
world happens to be one of the “strange” worlds described in the two exam- 
ples from above (although this seems to be improbable), then these examples 
indeed cause problems for a reliabilist approach. Since Goldman advocates an 
actual world account of reliability in “Strong and Weak Justification”, he tries 
to handle counterexamples such as the latter ones there by referring to the notion 
of weak justification (recall section 22.3): both examples would be instances 
of weak justifiedness (but not of strong justifiedness). In [71], p.158, Goldman 
suggests explaining the intuitive justifiedness of the victim’s beliefs in the
demon example by reference to the epistemic evaluator’s matching the victim’s 
cognitive processes to the items on his list of intellectual virtues, by which the 
evaluator may judge the victim’s beliefs to be justified. 

We have seen that Goldman’s reliability account seems to withstand the 
alleged counterexamples quite successfully. We say ‘seems’ because the different 
brands of Goldman’s theories offer different kinds of replies to the examples, 
and because it might be that these replies are based on assumptions which 
are themselves not justified. Thus, let us consider three more systematic com- 
plaints which are raised against Goldman’s sort of reliabilism in the literature, 
and which are directed against some of the assumptions on which some of the 
variants of Goldman’s account rest, i.e.: (i) there is a clear notion of “the” pro- 
cess type (the process token of) which generates the belief that is to be assessed 
epistemically, (ii) we have a subjective concept of normal worlds which does not 
presuppose a theory of justified belief in itself, and which employs a clear and 
distinct notion of normality, and (iii) the reliabilist theory of justification that 
is suggested by Goldman explains the very notion of justification that is tra- 
ditionally dealt with in epistemology. Several authors have claimed that these 
assumptions are not well-founded, or simply false. We will briefly address their 
critiques in the following three subsections. But we will follow the discussion only 
to the extent that is relevant for the reliabilist theory of justified inference that 
we are looking for. 

22.5.2 The Generality Problem 

This problem affects the individuation of process types which is presupposed 
in the reliabilist explication of the notion of justified belief. Goldman himself 
describes the problem as follows: “A critical problem concerning our analysis is 
the degree of generality of the process-types in question. Input-output relations 
can be specified very broadly or very narrowly, and the degree of generality will 
partly determine the degree of reliability. A process-type might be selected so 
narrowly that only one instance of it ever occurs, and hence the type is either 
completely reliable or completely unreliable. (This assumes that reliability is a 
function of actual frequency only.) If such narrow process-types were selected, 
beliefs that are intuitively unjustified might be said to result from perfectly 
reliable processes; and beliefs that are intuitively justified might be said to
result from perfectly unreliable processes” (Goldman[67], p.115). The problem is 
that, according to Goldman, the justifiedness of states of belief is to be
evaluated in terms of a property of processes, i.e., entities of a different ontological 
category. But it is not clear how to relate states of belief to their generating 
processes in some unique way. First of all, there may be several process tokens 
which might be said to cause the belief token concerned; secondly, these process 
tokens may be the instantiations of a great variety of process types. Which of 
the latter is the one that would be referred to in the justification condition 
that is stated in “What is Justified Belief?”? The same problem also affects all 
other formulations of Goldman’s reliability account. 
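
The dependence of reliability on the chosen degree of generality can be made concrete by a small computation. The following Python sketch is only an illustration: it assumes, as in Goldman's parenthetical remark, that reliability is a function of actual frequency only, and the function truth_ratio, the record format, and the sample data are hypothetical choices of our own.

```python
def truth_ratio(tokens, process_type):
    """Actual-frequency reliability of a process type: the share of
    true beliefs among the process tokens that instantiate the type.
    Returns None if no token instantiates the type."""
    outcomes = [is_true for features, is_true in tokens if process_type(features)]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


# Hypothetical process tokens: (features of the token, truth of the belief).
TOKENS = [
    ({"id": 1, "kind": "vision", "light": "good"}, True),
    ({"id": 2, "kind": "vision", "light": "good"}, True),
    ({"id": 3, "kind": "vision", "light": "poor"}, False),
    ({"id": 4, "kind": "vision", "light": "good"}, True),
]


def broad(features):
    # A broadly individuated type: any visual belief formation.
    return features["kind"] == "vision"


def narrow(token_id):
    # A type so narrow that exactly one token instantiates it.
    return lambda features: features["id"] == token_id


if __name__ == "__main__":
    print(truth_ratio(TOKENS, broad))       # reliability of the broad type
    print(truth_ratio(TOKENS, narrow(1)))   # single-instance type, true belief
    print(truth_ratio(TOKENS, narrow(3)))   # single-instance type, false belief
```

One and the same token (e.g., token 3) thus instantiates a fairly reliable broad type and a completely unreliable narrow type, which is exactly why the choice of “the” relevant type matters.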

Goldman has given preliminary suggestions of how to solve the general- 
ity problem (see e.g. [69], pp. 49-51), but in the relevant literature the problem is 
regarded as unsolved or even unsolvable, and as weighing heavily against
Goldman’s account (see Sosa[166], p.131; Pollock[125], pp. 117-118; Plantinga[123], 
pp.198f, pp.202f; Greco[75], pp. 286-289). 

22.5.3 Problems of Defining Reliability 

Goldman has been plagued by the question of “where” to measure reliability, 
and thus, of how to define the reliability of processes in a way which would be 
adequate for his reliabilist theory of justified belief. Pollock[125], p.116, states 
a closely related question: “Relative to which set of circumstances are we to 
judge reliability?” 

Among the alternatives considered by Goldman are: the actual world, 
worlds close to the actual world, and normal worlds. 

The (close to the) actual world approach has led to an amendment of 
Goldman’s account in form of the concept of weak justification. 

The normal worlds approach has been applied successfully in
subsection 22.5.1, but has been abandoned by Goldman for the reasons stated in
section 22.3 (similar objections have been pointed out by Plantinga[123], pp.203f, 
and Pollock[125], p.115). Additionally, Goldman’s subjectivist notion of
normal worlds, which brings his theory indeed very close to an internalist theory 
of justification**, may be criticized for a further reason: normal worlds have 

**This is denied by Goldman[69], p.109: “Does the current stratagem [i.e., normal worlds 
are the worlds which satisfy our general beliefs] turn my theory of justifiedness into a
subjectivist account rather than an objectivist one? I think not. . . Whether a given rule system 
is right does not simply depend on what we believe about the world. It depends on whether 

been defined by Goldman as those satisfying the general beliefs that we
actually have. Let us assume that these beliefs have been caused by certain basic 
psychological processes: the latter processes are then a fortiori perfectly reliable 
for the simple and trivial reason that they lead to beliefs which are true in the 
normal worlds, since the latter beliefs define what the normal worlds are; thus, 
by applying the (ARI) criterion, all of our general beliefs would be justified. But 
this just seems to be skepticism reversed to its opposite and equally abhorrent 
extreme. 



22.5.4 Problems of Subjective Justification 

The final reason why Goldman’s reliabilist account of justification has been 
rejected by several authors, is that “there is a powerful intuition that knowl- 
edge does require that the knower have some kind of sensitivity to the relia- 
bility of her evidence. Sometimes this intuition is expressed by insisting that 
knowledge requires subjective justification” (Greco[75], p.285); according to 
Plantinga[123], pp.204f, “the main and crucial problem is that the [i.e.,
Goldman’s] suggested necessary and sufficient condition of warrant is vastly too 
weak to be anywhere nearly sufficient. Justification, for Goldman, isn’t quite 
warrant; it is not the case that a sufficient degree of justification is (together 
with truth) necessary and sufficient for knowledge.” Pollock[125], p.113, even 
states: “reliability has nothing to do with epistemic justification.” 

22.5.5 Differences between Goldman’s Accounts and Our Account; How the 
Problems Are Avoided 

Although Goldman’s accounts of a process-reliabilist theory of justified belief 
have been called a ‘paragon’ for our low-level process-reliabilist theory of justified 
inference, there are some major differences between both of them, including: 
(i) Goldman does not develop a reliabilist theory of justified inference but 
only of belief; (ii) the format of our theory of justified inference differs from 
the format of Goldman’s theories of justified belief (e.g. from [69] with respect 
to the latter’s reference to rule systems); (iii) we do not demand any
non-undermining condition for justifiedness like in Goldman[67], [69], since in the 
case of justified inference it would not even be clear what such a condition might 
look like; (iv) we have no no-other-more-reliable-process-is-available condition 
(as referred to in section 22.1): such a condition would make no sense since 
we aim at the justification of (inference) processes themselves; (v) there is no 
counterpart in Goldman’s theories to our low-level postulate; (vi) Goldman does 
not distinguish between quantitative and qualitative notions of reliability; (vii) 
within our theory of justified inference we suggest measuring the reliability of 
an inference either by its objective probability of attaining truth, or by its degree 
of attaining truth in the objectively normal worlds. Both of these versions seem 

the processes permitted by the specified rule system really do have a high enough truth ratio 
in normal worlds. This is something that hinges on the ‘facts’ about these processes, not 
simply on our opinions. Hence, the proposal is objectivist, not subjectivist, in contour.” 

to avoid the problems that arise for Goldman from defining reliability by means of a
subjective notion, at least as far as our intended low-level theory of justified 
inference is concerned; (viii) Goldman does not develop his theory in formal, 
system-theoretic details. 

The generality problem has been discussed regarding its consequences 
for our own theory in sections 5.1, 6.1, 6.3, and finally in section 8.4. The prob- 
lem of subjective justification is irrelevant for our theory since we presuppose 
the low-level postulate stated in section 6.2. 




Chapter 23 

A SKETCH OF LOGIC PROGRAMMING 



Let us briefly repeat some of the basic definitions to be found in the standard 
literature on logic programming. We will use the excellent introductory article 
by Lifschitz[95], where all the notions and results given below are stated. A 
standard textbook reference on logic programming is Lloyd [96]. 

Definition 207 Let P be a set of propositional variables (positive literals). 
Negative literals are negated propositional variables. Literals are positive or 
negative literals. Let Lit be the set of literals (given relative to P). A rule 
element is a literal possibly preceded by the negation-as-failure symbol not. If X 
is a set of literals, let $not(X) = \{not\ n \mid n \in X\}$. X is inconsistent, if
$\exists\, n, \neg n \in X$ (consistent, otherwise). X is logically closed, if it is
consistent or equal to Lit. 

Definition 208 A rule is an ordered pair $Head \leftarrow Body$, whose first member 
Head is a literal, and whose second member Body is a (possibly empty) finite set 
of rule elements. A basic rule is a rule, whose body is a set of literals. A rule can 
be represented as $Head \leftarrow Pos \cup not(Neg)$ for some finite sets of literals Pos, 
Neg. The rule with the head $n_0$ and the body
$\{n_1, \ldots, n_k, not\ n_{k+1}, \ldots, not\ n_{k+l}\}$ is often written as: 

$$n_0 \leftarrow n_1, \ldots, n_k, not\ n_{k+1}, \ldots, not\ n_{k+l}.$$

Definition 209 A program is a set of rules. A basic program is a set of basic 
rules. 



In the following ‘X’ will range over sets of literals. 

Definition 210 Let B be a basic program. 

X is closed under B, if for every rule $Head \leftarrow Body \in B$ we have that 
($Body \subseteq X \Rightarrow Head \in X$). 

$Cn(B)$ is the smallest set of literals which is both logically closed and 
closed under B (such a set always exists). The elements of $Cn(B)$ are called 
the consequences of B. 
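
For finite basic programs, the consequences can be computed by a simple fixpoint iteration. The following Python sketch is only meant as an illustration: the encoding of literals as strings (with a leading "-" for classical negation), the representation of a basic rule as a pair (head, body), and the function names neg and consequences are assumptions of our own and not part of the definitions in Lifschitz[95].

```python
def neg(literal):
    """Complement of a literal; '-p' encodes the negative literal for p."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def consequences(basic_program, atoms):
    """Compute Cn(B) for a finite basic program B over the given atoms.

    basic_program: iterable of (head, body) with body a collection of literals.
    The least set closed under B is built by iteration; if it turns out to
    be inconsistent, the only logically closed superset is Lit itself.
    """
    lit = set(atoms) | {neg(a) for a in atoms}
    x = set()
    changed = True
    while changed:
        changed = False
        for head, body in basic_program:
            if set(body) <= x and head not in x:
                x.add(head)
                changed = True
    if any(neg(n) in x for n in x):
        return lit
    return x


if __name__ == "__main__":
    B = [("p", []), ("q", ["p"]), ("r", ["s"])]
    print(sorted(consequences(B, {"p", "q", "r", "s"})))
```

In the example, p is a fact, q follows from p, and r is not derived since s is not; hence the consequences are p and q. A basic program containing both p and -p as facts would have all of Lit as its set of consequences.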

Now we turn to programs in general and thus to negation as failure: 

Definition 211 Let $\Pi$ be an arbitrary program. 

The reduct $\Pi^X$ of $\Pi$ relative to X is the basic program obtained from $\Pi$ by 

• deleting each rule $Head \leftarrow Pos \cup not(Neg)$ s.t. $Neg \cap X \neq \emptyset$, and 

• replacing all remaining rules $Head \leftarrow Pos \cup not(Neg)$ by $Head \leftarrow Pos$. 

$\Pi^X$ is the resulting program. 

Definition 212 X is an answer set for a program $\Pi$, if $Cn(\Pi^X) = X$ (note 
that $\Pi^X$ is a basic program). For $\Pi$ being an arbitrary program, we define 
$Cn(\Pi)$ as the intersection of all answer sets for $\Pi$. For basic programs this is 
a conservative extension of the old definition. 
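
For small finite programs, the answer sets can be found by testing every set of literals against this fixpoint condition. The following Python sketch is only an illustration and makes no claim to efficiency: the encoding of rules as triples (head, pos, neg), the string encoding of literals, and the function names are assumptions of our own.

```python
from itertools import chain, combinations


def neg(literal):
    """Complement of a literal; '-p' encodes the negative literal for p."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def consequences(basic_program, lit):
    """Cn(B) for a finite basic program B given as (head, pos) pairs."""
    x = set()
    changed = True
    while changed:
        changed = False
        for head, pos in basic_program:
            if set(pos) <= x and head not in x:
                x.add(head)
                changed = True
    return set(lit) if any(neg(n) in x for n in x) else x


def reduct(program, x):
    """Delete every rule whose Neg meets x; drop 'not' parts of the rest."""
    return [(head, pos) for head, pos, ng in program if not set(ng) & x]


def answer_sets(program, atoms):
    """All X with Cn(reduct of the program relative to X) = X (brute force)."""
    lit = sorted(set(atoms) | {neg(a) for a in atoms})
    subsets = chain.from_iterable(combinations(lit, r) for r in range(len(lit) + 1))
    return [set(c) for c in subsets
            if consequences(reduct(program, set(c)), lit) == set(c)]


if __name__ == "__main__":
    # p <- not q ;  q <- not p
    program = [("p", [], ["q"]), ("q", [], ["p"])]
    print(answer_sets(program, ["p", "q"]))
    # p <- not p  has no answer set
    print(answer_sets([("p", [], ["p"])], ["p"]))
```

The first program has the two answer sets {p} and {q}, so that the intersection of its answer sets is empty; the second program has no answer set at all.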

Definition 213 X is closed under a program $\Pi$, if for every rule
$Head \leftarrow Pos \cup not(Neg) \in \Pi$ we have that
($Pos \subseteq X, Neg \cap X = \emptyset \Rightarrow Head \in X$). 

Proposition 214 Every answer set for a program $\Pi$ is closed under $\Pi$. 

Definition 215 A program $\Pi$ is hierarchical, if there is a level (rank) mapping 
$rk : Lit \to Ord$ from literals to the class of ordinals, s.t. for every rule
$Head \leftarrow Pos \cup not(Neg) \in \Pi$ which is not bodyless: 

$$rk(Head) > \max_{n \in Pos \cup Neg} rk(n).$$

Proposition 216 Hierarchical programs have at most one answer set. 
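
Whether a given mapping is a level mapping in the sense of definition 215 can be checked rule by rule. The following Python sketch is only an illustration: it restricts ranks to natural numbers (i.e., finite ordinals), encodes rules as triples (head, pos, neg), and uses the hypothetical function name respects_rank.

```python
def respects_rank(program, rk):
    """Check that rk(Head) > max rk(n), n in Pos or Neg, holds for every
    rule of the program that is not bodyless.

    program: iterable of (head, pos, neg) triples of literals.
    rk: dict from literals to natural numbers (finite ordinals only).
    """
    for head, pos, ng in program:
        body = list(pos) + list(ng)
        if body and rk[head] <= max(rk[n] for n in body):
            return False
    return True


if __name__ == "__main__":
    # q <-  ;  p <- q, not r
    layered = [("q", [], []), ("p", ["q"], ["r"])]
    print(respects_rank(layered, {"q": 0, "r": 0, "p": 1}))
    # p <- not q ;  q <- not p  admits no level mapping; this one fails
    cyclic = [("p", [], ["q"]), ("q", [], ["p"])]
    print(respects_rank(cyclic, {"p": 1, "q": 0}))
```

The second program shows why the condition matters: no assignment of ranks can satisfy both of its rules, and it is precisely this program that has more than one answer set.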

Definition 217 A rule element, rule or program is normal, if it does not
contain the classical negation symbol $\neg$. 

Proposition 218 Normal hierarchical programs have exactly one answer set. 

A rule 

$$n_0 \leftarrow n_1, \ldots, n_k, not\ n_{k+1}, \ldots, not\ n_{k+l}$$

may be identified with the default 

$$\frac{n_1 \wedge \ldots \wedge n_k : \neg n_{k+1}, \ldots, \neg n_{k+l}}{n_0}$$



In this way programs are identified with default theories of a special 
syntactic form. 

Let the extensions for default theories be defined as usual, and let the 
consequences of default theories be defined as the members of all extensions. 

Proposition 219 For every program $\Pi$: 

• if X is an answer set for $\Pi$ then $dc(X)$ is an extension for $\Pi$ (after 
translating as sketched above), 

• every extension for $\Pi$ is the deductive closure of an answer set for $\Pi$, and 
this answer set is determined uniquely. 




Chapter 24 

PREFERENTIAL INTERPRETED INHIBITION NET 
AGENTS AND THE SYSTEM P 



In the subsequent chapters we will state and prove representation theorems of 
a similar kind as in chapter 16, but now for different systems of nonmonotonic 
logic. 

The contents of this chapter and also of chapters 25-27 can be found 
in more abbreviated form in Leitgeb[91]; section 7 of [91] relates these contents 
to logic programming again, but we are not going to deal with this relationship 
in our remaining chapters. 

24.1 Two Important Kinds of Sets of Nodes within Inhibition Nets 

In the next section, we will define the notion of a preferential partially inter- 
preted inhibition network agent, and we will also present two subclasses of 
preferential partially interpreted inhibition net agents which play a role for the 
semantics of the system P (recall chapter 10). We are going to use the notions 
and definitions of chapter 15 (the same holds for the subsequent chapters). It 
will be shown that the second class, i.e., the class of preferential partially
interpreted net agents which are antitone, is precisely the class of net agents, which 
are disposed to draw inferences while obeying the rules of the system P. The 
first class, i.e., the class of preferential partially interpreted net agents which 
are odd, will be proved to be a proper subclass of the second class. The property 
of being antitone depends on the way the closure operator “behaves” in such 
networks; the property of being odd will be defined more directly by stating a 
constraint on the topology of networks. That is the reason why the property of 
being odd is interesting in itself, although the system P is not complete with 
respect to the class of preferential odd partially interpreted net agents but only 
sound - contrary to the class of preferential antitone net agents relative to 
which P is both sound and complete as we will prove later. 

In order to be able to define, in the next section, what odd or antitone
preferential partially interpreted inhibition network agents are, we first have
to specify two auxiliary notions: the notion of a set of nodes being odd in a 
network, and the notion of a set of nodes being antitone in a network. We will 
also prove some of their properties: 

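Definition 220 (Odd)

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN. Let $\mathcal{N} \subseteq N$ (later, we will always
assume that $bias \in \mathcal{N}$).

$\mathcal{N}$ is odd in $\mathcal{I}$ iff

there is no path $n^1_0, \ldots, n^1_{k_1}, n^2_0, \ldots, n^2_{k_2}, \ldots, n^r_0, \ldots, n^r_{k_r}$ in $\mathcal{I}$ with

• $n^1_0 \in \mathcal{N}$, $n^1_0 \neq bias$, $n^r_{k_r} \in \mathcal{N}$

• for all $i \in \{1, \ldots, r\}$, $j \in \{0, \ldots, k_i - 1\}$: $n^i_j \; E \; n^i_{j+1}$

• for all $i \in \{1, \ldots, r - 1\}$: there is an $m \in N$, s.t. $n^i_{k_i} \; I \; \langle m, n^{i+1}_0 \rangle$

• and $r - 1$ is even.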

See fig. 9 for a “forbidden” path (if $\mathcal{N}$ is odd in $\mathcal{I}$, and if the first and
the last node in the path are members of $\mathcal{N}$):

Fig. 9: A “Forbidden” Path for Odd Subsets

Remark 221 Def.220 implies that, if $\mathcal{N}$ is odd in $\mathcal{I}$, there are no $n_0, n_1 \in \mathcal{N}$
with $n_0 \neq bias$, s.t. $n_0 \; E \; n_1$; this says, roughly, that a node in an odd set
$\mathcal{N}$ must not have any direct excitatory influence on other nodes in $\mathcal{N}$. More
generally, by def.220, a node in an odd set $\mathcal{N}$ must also not have any indirect
excitatory influence on other nodes in $\mathcal{N}$ via an even number of inhibitions,
s.t. the last inhibitory connection blocks a path originating from the bias
node. Let me draw an analogy: suppose, for some reason, we want to exclude
“directly positive” formulas $\alpha$, i.e. formulas without negation (but just
with, say, disjunction), from a propositional language; then we will usually
also want to exclude the “indirectly positive” formulas of the form
$\neg\neg\alpha, \neg\neg\neg\neg\alpha, \ldots$, where $\alpha$ is “directly positive”, since the latter are
logically equivalent to the former. In a similar way, direct and indirect
excitatory connections between nodes of odd sets are excluded.

The definition of oddness of sets in a network somewhat resembles the
definition of odd dependence in logic programming, which is usually introduced
in order to define notions such as call-consistency, by which odd dependence
of a predicate on itself is excluded (see any standard introduction to logic
programming). In our case, odd sets prove useful for the study of hierarchical
networks, i.e., for a context where cycles are disregarded from the start. We
cannot offer any results which would indicate to what extent theorems on odd
dependence can be translated into theorems on odd subsets.







Corollary 222 Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN. Let $\mathcal{N} \subseteq N$, s.t. $\mathcal{N}$ is odd
in $\mathcal{I}$.

If there are nodes $n_1, n_2, n_3 \in \mathcal{N}$, s.t. $n_2$ lies on a path which leads
from $n_1$ to $n_3$, then either $n_1 = n_2$, or $n_2 = n_3$.

Proof:

If $n_1 \neq n_2$ and $n_2 \neq n_3$ ($n_1 \neq n_3$ since $\mathcal{I}$ is an FHIN), it follows from
$\mathcal{N}$ being odd in $\mathcal{I}$ that there is a path from $n_1$ to $n_2$ having an odd number of
inhibitions, and that there is a path from $n_2$ to $n_3$ having an odd number of
inhibitions. By concatenating those paths, there is a path from $n_1$ to $n_3$ with
an even number of inhibitions, which contradicts $\mathcal{N}$ being odd in $\mathcal{I}$. ■

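Lemma 223 ($\mathcal{N}$-“distribution” I)

For every FHIN $\mathcal{I} = \langle N, E, I, bias \rangle$, for every $\mathcal{N} \subseteq N$ s.t. $bias \in \mathcal{N}$ and
$\mathcal{N}$ is odd in $\mathcal{I}$, and for all states $s_1, s_2 \subseteq \mathcal{N}$:

$\mathcal{N} \cap Cl(s_1) \cap Cl(s_2) \subseteq \mathcal{N} \cap Cl(s_1 \cap s_2)$.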



Proof: 

Let $n_1 \in \mathcal{N}$: assume that $Cl(s_1)(n_1) = 1$, $Cl(s_2)(n_1) = 1$, and, for
contradiction, suppose that $Cl(s_1 \cap s_2)(n_1) = 0$.

1. In this case $n_1 \notin s_1 \cap s_2$, and thus we may assume that $n_1 \notin s_1$, without
loss of generality. But since $Cl(s_1)(n_1) = 1$, and since $\mathcal{I}$ is finite, it
follows that there is a (not necessarily unique) E-path $m^1_0, \ldots, m^1_{k_1}$ of
maximal length, s.t. $Cl(s_1)(m^1_0) = 1$, $m^1_{k_1} = n_1$, and there is no node $m$
s.t. $Cl(s_1)(m) = 1$, $m \; I \; \langle m^1_i, m^1_{i+1} \rangle$ for some $i \in \{0, \ldots, k_1 - 1\}$. Since
$\mathcal{N}$ is odd in $\mathcal{I}$, $m^1_0$ cannot be a member of $\mathcal{N} \setminus \{bias\}$. Thus, also
$m^1_0 \notin s_1 \setminus \{bias\}$. Moreover, there is no node $m$ s.t. $m \; E \; m^1_0$, and
$Cl(s_1)(m) = 1$, because this would contradict the maximality of the path
$m^1_0, \ldots, m^1_{k_1}$. Therefore, because of $Cl(s_1)(m^1_0) = 1$, it follows that
$m^1_0 = bias$.

2. $Cl(s_1 \cap s_2)(n_1) = 0$ implies that there has to be a node $n_2$, s.t.
$Cl(s_1 \cap s_2)(n_2) = 1$ and $n_2 \; I \; \langle m^1_{i_1}, m^1_{i_1+1} \rangle$ for some $i_1 \in \{0, \ldots, k_1 - 1\}$.
$n_2 \neq n_1$, for $\mathcal{I}$ being hierarchical. From above we know that $Cl(s_1)(n_2) = 0$,
therefore $n_2 \notin s_1$, and so we also have that $n_2 \notin s_1 \cap s_2$, and $n_2 \neq bias$. It
follows again that there is a (not necessarily unique) E-path $m^2_0, \ldots, m^2_{k_2}$
of maximal length s.t. $Cl(s_1 \cap s_2)(m^2_0) = 1$, $m^2_{k_2} = n_2$, and there is no node
$m$ s.t. $Cl(s_1 \cap s_2)(m) = 1$, $m \; I \; \langle m^2_i, m^2_{i+1} \rangle$, for some $i \in \{0, \ldots, k_2 - 1\}$.
Since $n_2$ is not necessarily a member of $\mathcal{N}$, we cannot simply infer again
that $m^2_0 = bias$. But for the maximality of the selected path, we know at
least that $m^2_0 \in s_1 \cap s_2$, and thus also $s_1(m^2_0) = 1$.

$Cl(s_1)(n_2) = 0$ therefore implies that there has to be a node $n_3$ s.t.
$Cl(s_1)(n_3) = 1$, and $n_3 \; I \; \langle m^2_{i_2}, m^2_{i_2+1} \rangle$ for some $i_2 \in \{0, \ldots, k_2 - 1\}$.
$n_3 \neq n_1$, $n_3 \neq n_2$, since $\mathcal{I}$ is hierarchical. Suppose, for contradiction, that
$n_3 \in s_1$: then $n_3 \in \mathcal{N}$, $n_3 \neq bias$ because $Cl(s_1 \cap s_2)(n_3) = 0$; and there
is a path $u_0, \ldots, u_k$ in $\mathcal{I}$ with $u_0 = n_3$ (thus $u_0 \in \mathcal{N}$), $u_1 = m^2_{i_2+1}$, $\ldots$,
$u_{k_2 - i_2} = m^2_{k_2}$, $u_{k_2 - i_2 + 1} = m^1_{i_1+1}$, $\ldots$, $u_k = m^1_{k_1}$ (thus $u_k \in \mathcal{N}$), s.t.
$u_0 \; I \; \langle m^2_{i_2}, u_1 \rangle$, $u_{k_2 - i_2} \; I \; \langle m^1_{i_1}, u_{k_2 - i_2 + 1} \rangle$; between the rest of the nodes in
the path, there are excitatory connections. But this contradicts $\mathcal{N}$ being
odd in $\mathcal{I}$. Therefore, $n_3 \notin s_1$.

3. Now, we are in a similar situation as at the stage of proof item 1:
$n_3 \notin s_1$, $Cl(s_1)(n_3) = 1$, $Cl(s_1 \cap s_2)(n_3) = 0$. There has to
be a (not necessarily unique) E-path $m^3_0, \ldots, m^3_{k_3}$ of maximal length, s.t.
$Cl(s_1)(m^3_0) = 1$, $m^3_{k_3} = n_3$, and there is no node $m$ s.t. $Cl(s_1)(m) = 1$,
$m \; I \; \langle m^3_i, m^3_{i+1} \rangle$ for some $i \in \{0, \ldots, k_3 - 1\}$. Since $\mathcal{N}$ is odd in $\mathcal{I}$, $m^3_0$
cannot be a member of $\mathcal{N} \setminus \{bias\}$, because otherwise we can find a path
from $m^3_0$ to $n_1$ with an even number of inhibitions. So it follows again
that $m^3_0 = bias$. Extending the argument, analogously as above, it follows
that there is an infinite sequence $n_1, n_2, n_3, \ldots$ of pairwise distinct nodes
in $N$, contradicting the finiteness of $\mathcal{I}$.

Therefore, $Cl(s_1 \cap s_2)(n_1) = 1$. ■

If $Cl$ has the property that for every $\mathcal{N} \subseteq N$ s.t. $bias \in \mathcal{N}$, for all
states $s_1, s_2 \subseteq \mathcal{N}$,

$\mathcal{N} \cap Cl(s_1) \cap Cl(s_2) \subseteq \mathcal{N} \cap Cl(s_1 \cap s_2)$

it will be said to satisfy ‘($\mathcal{N}$-)distribution’ (corresponding to the distribution
property of closure operators for sets of formulas; see Makinson[99], p.47).

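Definition 224 (Antitone)

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN. Let $\mathcal{N} \subseteq N$ (later, we will always
assume that $bias \in \mathcal{N}$).

$\mathcal{N}$ is antitone in $\mathcal{I}$ iff
for all $n \in \mathcal{N}$ the mapping

$F_n: \wp(\mathcal{N} \setminus \{n\}) \to \{0, 1\}$

$s \mapsto Cl(s)(n)$

is antitone, i.e., for all $s_1, s_2 \in \wp(\mathcal{N} \setminus \{n\})$: if $s_1 \subseteq s_2$ then
$F_n(s_1) \geq F_n(s_2)$.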
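
For small networks, def.224 can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: it computes the closure of a state by iterating the update rule that is used in the proof of lemma 240 (a node fires iff it belongs to the input or receives an excitation from an active node that no active node inhibits) until a stable state is reached, and it always treats the bias node as part of the input. The names `closure`, `subsets`, `is_antitone` and the two toy nets are hypothetical and are not taken from the text.

```python
from itertools import combinations

BIAS = "bias"

def closure(nodes, exc, inh, inp):
    """Stable state of a hierarchical net under input `inp` (bias always on).

    exc: set of excitatory connections (src, dst)
    inh: set of inhibitory connections (inhibitor, (src, dst))
    """
    inp = frozenset(inp) | {BIAS}
    state = inp
    for _ in range(len(nodes) + 1):
        nxt = set(inp)
        for (src, dst) in exc:
            # src -> dst is blocked as soon as one of its inhibitors is active
            blocked = any(b in state for (b, edge) in inh if edge == (src, dst))
            if src in state and not blocked:
                nxt.add(dst)
        if nxt == state:
            return state
        state = frozenset(nxt)
    raise ValueError("no stable state reached (is the net hierarchical?)")

def subsets(items):
    """All subsets of `items` as frozensets."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_antitone(nodes, exc, inh, dist):
    """Brute-force test of def.224 for the distinguished set `dist`."""
    for n in dist:
        states = list(subsets(set(dist) - {n}))
        for s1 in states:
            for s2 in states:
                if s1 <= s2:
                    f1 = n in closure(nodes, exc, inh, s1)
                    f2 = n in closure(nodes, exc, inh, s2)
                    if f1 < f2:  # F_n(s1) < F_n(s2) violates antitonicity
                        return False
    return True

NODES = {BIAS, "a", "b"}
# toy net 1: a inhibits the excitatory connection bias -> b
inhibiting = is_antitone(NODES, {(BIAS, "b")}, {("a", (BIAS, "b"))}, NODES)
# toy net 2: a excites b directly
exciting = is_antitone(NODES, {("a", "b")}, set(), NODES)
print(inhibiting, exciting)
```

Under these assumptions the first toy net passes the check, while the second one fails it, since activating a switches b on rather than off.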



Now we will state some equivalent reformulations of def.224, and also
one of its implications:

Corollary 225 Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be an FHIN. Let $\mathcal{N} \subseteq N$.

1. $\mathcal{N}$ is antitone in $\mathcal{I}$ iff

for all $n \in \mathcal{N}$, for all $s_2 \in \wp(\mathcal{N} \setminus \{n\})$:

if $Cl(s_2)(n) = 1$ then for all $s_1$ s.t. $s_1 \subseteq s_2$, it holds that $Cl(s_1)(n) = 1$

2. For $\mathcal{N}$ antitone in $\mathcal{I}$, for all $n \in \mathcal{N}$: if there is an $s \in \wp(\mathcal{N} \setminus \{n\})$ s.t.
$Cl(s)(n) = 1$, then $Cl(\{bias\})(n) = 1$

3. $\mathcal{N}$ is antitone in $\mathcal{I}$ iff

for all $X \subseteq \mathcal{N}$, for all $s_1, s_2 \in \wp(\mathcal{N} \setminus X)$:
if $s_1 \subseteq s_2$ then $Cl(s_1) \cap X \supseteq Cl(s_2) \cap X$

4. In def.224, we could equivalently demand the mapping

$F'_n: \wp([\mathcal{N} \cap In(n)] \setminus \{n\}) \to \{0, 1\}$

$s \mapsto Cl(s)(n)$

to be antitone, where $In(n)$ is the set of nodes from which there are paths
to $n$ (i.e., which may have “causal influence” on $n$). This makes it easier
to check whether a set $\mathcal{N}$ is antitone in $\mathcal{I}$.

Proof:

1. straightforward;

2. this follows from claim 1, since $\{bias\}$ is a subset of every state;

3. assume that $\mathcal{N}$ is antitone in $\mathcal{I}$, let $X \subseteq \mathcal{N}$, $s_1, s_2 \in \wp(\mathcal{N} \setminus X)$, and
$s_1 \subseteq s_2$. If $Cl(s_1) \cap X \not\supseteq Cl(s_2) \cap X$, then there is an $n \in Cl(s_2) \cap X$, s.t.
$n \notin Cl(s_1) \cap X$. Since $n \in X$, it follows that $s_2 \in \wp(\mathcal{N} \setminus \{n\})$. But because
$Cl(s_2)(n) = 1$, and since $\mathcal{N}$ is antitone in $\mathcal{I}$, we have that $Cl(s_1)(n) = 1$
(by claim 1 of this corollary), and therefore $n \in Cl(s_1) \cap X$, which is a
contradiction.

Conversely, assume the property stated in claim 3 on the right hand side of
the equivalence sign, and suppose for contradiction that $\mathcal{N}$ is not antitone
in $\mathcal{I}$. By 1, there is an $n \in \mathcal{N}$, an $s_2 \in \wp(\mathcal{N} \setminus \{n\})$, and an $s_1$ with
$s_1 \subseteq s_2$, s.t. $Cl(s_2)(n) = 1$, and $Cl(s_1)(n) = 0$. Now we simply set
$X := \{n\}$, and then we have: $\{n\} \subseteq \mathcal{N}$, $s_1, s_2 \in \wp(\mathcal{N} \setminus X)$, $s_1 \subseteq s_2$, but
$Cl(s_1) \cap X = \emptyset \not\supseteq \{n\} = Cl(s_2) \cap X$, which is a contradiction.

4. This follows from the fact that, for $s \in \wp(\mathcal{N} \setminus \{n\})$, $Cl(s)(n)$ does not
depend on the values of $s$ for nodes outside of $In(n)$. ■

Lemma 226 ($\mathcal{N}$-“Distribution” II)

For every FHIN $\mathcal{I} = \langle N, E, I, bias \rangle$, for every $\mathcal{N} \subseteq N$: if $\mathcal{N}$ is antitone
in $\mathcal{I}$, then for all states $s_1, s_2 \subseteq \mathcal{N}$:

$\mathcal{N} \cap Cl(s_1) \cap Cl(s_2) \subseteq \mathcal{N} \cap Cl(s_1 \cap s_2)$.



Proof:

Suppose that $\mathcal{N} \subseteq N$ is antitone in $\mathcal{I}$.

Let $n \in \mathcal{N} \cap Cl(s_1) \cap Cl(s_2)$. We distinguish the following two cases:

1. $n \in s_1$, $n \in s_2$. But then also $n \in s_1 \cap s_2 \subseteq Cl(s_1 \cap s_2)$, and thus
$n \in \mathcal{N} \cap Cl(s_1 \cap s_2)$.

2. $n \notin s_1$, or $n \notin s_2$. Without restricting generality, assume that $n \notin s_2$. Now
we have that $n \in \mathcal{N}$, $s_2 \in \wp(\mathcal{N} \setminus \{n\})$, $Cl(s_2)(n) = 1$, $s_1 \cap s_2 \subseteq s_2$, and
$Cl(s_1)(n) = 1$. By claim 1 of corollary 225, it follows that $Cl(s_1 \cap s_2)(n) = 1$,
and therefore $n \in \mathcal{N} \cap Cl(s_1 \cap s_2)$. ■

Antitonicity does not only imply distribution but also vice versa: 

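Lemma 227 ($\mathcal{N}$-“Distribution” III)

For every FHIN $\mathcal{I} = \langle N, E, I, bias \rangle$, for every $\mathcal{N} \subseteq N$: if $\mathcal{N}$ is not
antitone in $\mathcal{I}$, then not for all states $s_1, s_2 \subseteq \mathcal{N}$:

$\mathcal{N} \cap Cl(s_1) \cap Cl(s_2) \subseteq \mathcal{N} \cap Cl(s_1 \cap s_2)$.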



Proof:

Suppose that $\mathcal{N} \subseteq N$ is not antitone in $\mathcal{I}$.

Then there is an $n \in \mathcal{N}$, and there are $s_1, s_2 \in \wp(\mathcal{N} \setminus \{n\})$ with $s_1 \subseteq s_2$,
s.t. $Cl(s_1)(n) < Cl(s_2)(n)$. Therefore, $Cl(s_1)(n) = 0$, $Cl(s_2)(n) = 1$. So we
have that $n \in Cl(s_1 \cup \{n\})$, $n \in Cl(s_2)$, thus also $n \in Cl(s_1 \cup \{n\}) \cap Cl(s_2)$,
but simultaneously $n \notin Cl([s_1 \cup \{n\}] \cap s_2) = Cl(s_1)$, which contradicts
$\mathcal{N}$-distribution. ■

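Lemmata 226 and 227 show that $\mathcal{N}$-distribution corresponds precisely
to $\mathcal{N}$ being antitone in $\mathcal{I}$. It follows:

Corollary 228 For every FHIN $\mathcal{I} = \langle N, E, I, bias \rangle$, for every $\mathcal{N} \subseteq N$ s.t.
$bias \in \mathcal{N}$:

if $\mathcal{N}$ is odd in $\mathcal{I}$, then $\mathcal{N}$ is antitone in $\mathcal{I}$.

Proof:

According to lemma 223, if $\mathcal{N}$ is odd in $\mathcal{I}$, $\mathcal{I}$ satisfies $\mathcal{N}$-distribution.
But because of lemma 227, if $\mathcal{I}$ satisfies $\mathcal{N}$-distribution, $\mathcal{N}$ is antitone in $\mathcal{I}$. ■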
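
The correspondence between distribution and antitonicity stated in lemmata 226 and 227 can be observed on small examples. The following Python sketch is a hedged illustration, not the author's method: it assumes that the closure of a state is the stable state reached by iterating the update rule from the proof of lemma 240, with the bias node always active, and it tests the distribution property for all pairs of states over a distinguished set. The helper names and the two toy nets are hypothetical.

```python
from itertools import combinations

BIAS = "bias"

def closure(nodes, exc, inh, inp):
    """Stable state of a hierarchical net under input `inp` (bias always on)."""
    inp = frozenset(inp) | {BIAS}
    state = inp
    for _ in range(len(nodes) + 1):
        nxt = set(inp)
        for (src, dst) in exc:
            blocked = any(b in state for (b, edge) in inh if edge == (src, dst))
            if src in state and not blocked:
                nxt.add(dst)
        if nxt == state:
            return state
        state = frozenset(nxt)
    raise ValueError("no stable state reached (is the net hierarchical?)")

def subsets(items):
    """All subsets of `items` as frozensets."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def satisfies_distribution(nodes, exc, inh, dist):
    """Check dist & Cl(s1) & Cl(s2) <= dist & Cl(s1 & s2) for all states s1, s2."""
    dist = frozenset(dist)
    cl = {s: closure(nodes, exc, inh, s) for s in subsets(dist)}
    for s1 in cl:
        for s2 in cl:
            if not (dist & cl[s1] & cl[s2]) <= (dist & cl[s1 & s2]):
                return False
    return True

NODES = {BIAS, "a", "b"}
# antitone toy net: a inhibits bias -> b
holds = satisfies_distribution(NODES, {(BIAS, "b")}, {("a", (BIAS, "b"))}, NODES)
# non-antitone toy net: a excites b
fails = satisfies_distribution(NODES, {("a", "b")}, set(), NODES)
print(holds, fails)
```

In the second net the violating pair is the one used in the proof of lemma 227: b lies in the closures of {a} and {b}, but not in the closure of their intersection.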

Remark 229 It is not difficult to show (by counterexamples) that the other
direction, i.e., from being antitone to being odd, does not hold in general.

24.2 The Network Semantics for Pref. Partially Interpreted Inh.
Net Agents

We can now introduce a network semantics for restricted FHINs of the above
form; note that we strengthen the postulates for the interpretation mappings
$\mathfrak{I}^{c,o}$ accordingly:

Definition 230 (Preferential Partially Interpreted Inhibition Network Agents)
A preferential (partially interpreted inhibition) network agent $\mathfrak{N}$ is
a sextuple $\langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$ defined analogously to def.153, but also with
some important differences:

1. there is a distinguished non-empty subset N of N, s.t. Cl{{bias}) ^ N 
and N is antitone in X 

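2. $\mathfrak{I}^{c,o}: \mathcal{L} \to \wp(\mathcal{N})$

with $\mathfrak{I}^{c,o}(\top) = \{bias\}$, $\mathfrak{I}^{c,o}(\bot) = \mathcal{N}$, and where the following postulates
are satisfied:

(a) let $\mathcal{TH}_{\mathfrak{I}^{c,o}} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{c,o}(\varphi) = \{bias\}\}$:

for all $\varphi, \psi \in \mathcal{L}$: if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \to \psi$ then $\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$

(b) for all $\varphi, \psi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi \wedge \psi) = \mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\psi)$

(c) for all $\varphi, \psi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi \vee \psi) = \mathfrak{I}^{c,o}(\varphi) \cap \mathfrak{I}^{c,o}(\psi)$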

(d) for all ip G C: = N \ 3^'^{p) 

(e) for all $\varphi \in \mathcal{L}$: $bias \in \mathfrak{I}^{c,o}(\varphi)$.

In the case of a partially interpreted network, the nodes contained in
$N \setminus \mathcal{N}$ may be considered to be auxiliary “inter-neurons”, without any
representational function. The nodes contained in $\mathcal{N}$ might be called
‘cognitive’. We will call a preferential network agent
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$ odd if and only if $\mathcal{N}$ is odd in $\mathcal{I}$.

The additional constraints on $\mathfrak{I}^{c,o}$ which we have imposed on
preferential network agents should not be mistaken for implying such obviously
counter-intuitive postulates as: if $\mathfrak{N}$ believes in $s$ that [$\varphi \vee \psi$ is true], then she
believes in $s$ that [$\varphi$ is true] or she believes in $s$ that [$\psi$ is true]; if $\mathfrak{N}$ does not
believe in $s$ that [$\varphi$ is true], then she believes in $s$ that [$\neg\varphi$ is true]. The latter
are indeed not entailed.

We can associate theories of conditionals with preferential partially 
interpreted inhibition network agents in the same way as in the cumulative- 
ordered case of chapter 16. We can define: 

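Definition 231 (Preferential Network Semantics)

1. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$ is called preferential-net-valid iff
for every preferential net agent $\mathfrak{N}$: $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

2. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$ be a preferential partially
interpreted inhibition network agent: we say that

$\mathfrak{N} \models KB_{\Rightarrow}$ iff

for every $\alpha[x] \Rightarrow \beta[x] \in KB_{\Rightarrow}$ it holds that $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

3. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
we say that

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$

($KB_{\Rightarrow}$ preferential-net-implies $\alpha[x] \Rightarrow \beta[x]$) iff

for every preferential net agent $\mathfrak{N}$: if $\mathfrak{N} \models KB_{\Rightarrow}$, then $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.

We loosely refer to the notions defined in this section by the term
‘preferential net semantics’.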

24.3 The Representation Theorem for P 

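24.3.1 The Soundness Lemma for P

The soundness of the rules of C (recall chapter 16) translates into the soundness
of the corresponding rules of P. Only Or remains to be shown sound:

Lemma 232 Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$ be a preferential partially
interpreted inhibition network agent; let $\alpha, \beta, \gamma \in \mathcal{L}$:

$\mathfrak{N}$ satisfies

Or: if $\mathfrak{N} \models \alpha[x] \Rightarrow \gamma[x]$, $\mathfrak{N} \models \beta[x] \Rightarrow \gamma[x]$, then $\mathfrak{N} \models \alpha[x] \vee \beta[x] \Rightarrow \gamma[x]$.

Proof:

By assumption, $\mathfrak{I}^{c,o}(\gamma) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$, $\mathfrak{I}^{c,o}(\gamma) \subseteq Cl(\mathfrak{I}^{c,o}(\beta))$; since
$\mathfrak{I}^{c,o}(\gamma) \subseteq \mathcal{N}$, we thus have
$\mathfrak{I}^{c,o}(\gamma) \subseteq \mathcal{N} \cap Cl(\mathfrak{I}^{c,o}(\alpha)) \cap Cl(\mathfrak{I}^{c,o}(\beta))$, and since
$\mathcal{N} \cap Cl(\mathfrak{I}^{c,o}(\alpha)) \cap Cl(\mathfrak{I}^{c,o}(\beta)) \subseteq \mathcal{N} \cap Cl(\mathfrak{I}^{c,o}(\alpha) \cap \mathfrak{I}^{c,o}(\beta))$ by lemma 226, and
$Cl(\mathfrak{I}^{c,o}(\alpha) \cap \mathfrak{I}^{c,o}(\beta)) = Cl(\mathfrak{I}^{c,o}(\alpha \vee \beta))$ by def.230, we are done. ■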

This implies: 

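Lemma 233 (P-Soundness I)

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$ be a preferential partially interpreted
inhibition network agent:

then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is a consistent conditional P-theory extending $\mathcal{TH}_{\rightarrow}(\mathfrak{N})$.

24.3.2 The Completeness Lemma for P

Lemma 234 (P-Completeness I)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, i.e. deductively closed:

for every consistent conditional P-theory $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ extending $\mathcal{TH}_{\rightarrow}$
there is a preferential partially interpreted inhibition network agent
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$,
s.t.

• $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and

• $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$, i.e. for every $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:

$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.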



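Proof:

By theorem 106 stated in chapter 11 (and proved by Kraus et al.[85],
pp. 193-196) for every $\mathcal{TH}_{\Rightarrow}$ as above there is a finite preferential model
$\mathfrak{M}_P = \langle \mathbf{S}, l, \prec \rangle$ (based on the set $W$ of worlds satisfying $\mathcal{TH}_{\rightarrow}$), s.t.
$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{M}_P \models \alpha[x] \Rightarrow \beta[x]$, i.e. all states minimal with
respect to $\prec$ which make $\alpha$ true, also make $\beta$ true. In the following, we use
$\mathfrak{M}_P$ to construct the intended partially interpreted network $\mathfrak{N}$. We use
‘$\mathbf{s}$’ with or without index to range over states in the sense of preferential
models, and ‘$s$’, as usual, to range over net states.

For every $\mathbf{s} \in \mathbf{S}$ let $L_{\mathbf{s}} = \{\mathbf{s}' \in \mathbf{S} \mid \mathbf{s}' \prec \mathbf{s}\}$; say, $L_{\mathbf{s}} = \{\mathbf{s}_1, \ldots, \mathbf{s}_{r_{\mathbf{s}}}\}$.
Furthermore, let $\bigwedge L_{\mathbf{s}}$ be a conjunction subnetwork for the nodes $\mathbf{s}_1, \ldots, \mathbf{s}_{r_{\mathbf{s}}}$;
i.e., there is a node $\mathbf{s}_1 \wedge \ldots \wedge \mathbf{s}_{r_{\mathbf{s}}}$, the activity of which matches the Boolean
conjunction of the activity states of $\mathbf{s}_1, \ldots, \mathbf{s}_{r_{\mathbf{s}}}$ (compare theorem 134 chapter
14). If $\mathbf{s}$ is minimal in $\prec$, then let $\bigwedge L_{\mathbf{s}}$ be the empty subnet. Now we define:

let $N$ = $\{bias\}$ joined with the set of nodes of the conjunction
subnetwork $\bigwedge L_{\mathbf{s}}$ for each $\mathbf{s} \in \mathbf{S}$.

Let $E$ = $\{\langle bias, \mathbf{s} \rangle \mid \mathbf{s}$ is not minimal according to $\prec\}$ joined with the
set of excitatory connections within the conjunction subnetworks $\bigwedge L_{\mathbf{s}}$ for each
$\mathbf{s} \in \mathbf{S}$.

Moreover, for non-minimal $\mathbf{s} \in \mathbf{S}$, and $L_{\mathbf{s}} = \{\mathbf{s}_1, \ldots, \mathbf{s}_{r_{\mathbf{s}}}\}$,
let $I_{\mathbf{s}} = \{\langle \mathbf{s}_1 \wedge \ldots \wedge \mathbf{s}_{r_{\mathbf{s}}}, \langle bias, \mathbf{s} \rangle \rangle\}$. If $\mathbf{s}$ is minimal in $\prec$, then let $I_{\mathbf{s}} = \emptyset$.
Let $I = \bigcup_{\mathbf{s} \in \mathbf{S}} I_{\mathbf{s}}$ joined with the inhibitory connections within each conjunction
subnetwork. Obviously, $I \subseteq N \times E$.

Let $\mathcal{I} = \langle N, E, I, bias \rangle$. Since $\mathcal{I}$ does not contain any cycles, and since
$\mathcal{I}$ is finite (because $\mathbf{S}$ is), $\mathcal{I}$ is an FHIN.

Let $\mathcal{N} = \{bias\} \cup \mathbf{S}$. As a consequence of our def. of $\mathcal{I}$ and our def. of
$\mathcal{N}$, we see that $\mathcal{N}$ is antitone in $\mathcal{I}$. This follows because for all $\mathbf{s} \in \mathcal{N}$, for
all $s_2 \in \wp(\mathcal{N} \setminus \{\mathbf{s}\})$: if $Cl(s_2)(\mathbf{s}) = 1$ then $\mathbf{s}$ has to be excited by the bias
node via $\langle bias, \mathbf{s} \rangle$; but this is only possible if $Cl(s_2)(\mathbf{s}_1 \wedge \ldots \wedge \mathbf{s}_{r_{\mathbf{s}}}) = 0$ (for
$L_{\mathbf{s}} = \{\mathbf{s}_1, \ldots, \mathbf{s}_{r_{\mathbf{s}}}\}$). Thus there has to be a node within $L_{\mathbf{s}}$ which is not active
in $Cl(s_2)$. Now, for all $s_1$ s.t. $s_1 \subseteq s_2$, it also holds that there is a node within
$L_{\mathbf{s}}$ which is not active in $Cl(s_1)$, because if every node in $L_{\mathbf{s}}$ were active in
$Cl(s_1)$, there would be a node $\mathbf{s}^* \in L_{\mathbf{s}}$, s.t. $\mathbf{s}^* \notin Cl(s_2)$, but $\mathbf{s}^* \in Cl(s_1)$. $\mathbf{s}^*$
cannot be a member of layer 0, because $s_1 \subseteq s_2$. Thus $\mathbf{s}^*$ has to be excited in
$Cl(s_1)$ by the bias node via $\langle bias, \mathbf{s}^* \rangle$; but this is only possible if
$Cl(s_1)(\mathbf{s}^*_1 \wedge \ldots \wedge \mathbf{s}^*_{r_{\mathbf{s}^*}}) = 0$ (where $L_{\mathbf{s}^*} = \{\mathbf{s}^*_1, \ldots, \mathbf{s}^*_{r_{\mathbf{s}^*}}\}$). Therefore, there has to
be a node $\mathbf{s}^*_i$ within $L_{\mathbf{s}^*}$ which is not active in $Cl(s_1)$. But $\mathbf{s}^*_i$ is also a
member of $L_{\mathbf{s}}$, which contradicts our assumption that every node in $L_{\mathbf{s}}$ is active
in $Cl(s_1)$. So we have that there is also a node within $L_{\mathbf{s}}$ which is not active
in $Cl(s_1)$. Thus, $Cl(s_1)(\mathbf{s}_1 \wedge \ldots \wedge \mathbf{s}_{r_{\mathbf{s}}}) = 0$, therefore $Cl(s_1)(\mathbf{s}) = 1$. So, $\mathcal{N}$ is
antitone in $\mathcal{I}$.

Now we define for $\varphi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup \{\mathbf{s} \mid \mathbf{s}$ does not make $\varphi$ true$\}$.

Obviously, $\mathfrak{I}^{c,o}$ is an interpretation mapping satisfying the postulates of def.230.
Define $N^p$, $\mathfrak{I}^{p}$, and $g$ as expected, s.t. $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^{p}(\varphi))$. $\mathcal{I}$, $g$, $\mathfrak{I}^{p}$,
$\mathfrak{I}^{c,o}$, $\mathcal{N}$ determine a system $Sys$, s.t. $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$ is a
preferential partially interpreted and antitone inhibition network agent.

Now we show that $\mathfrak{M}_P \models \alpha[x] \Rightarrow \beta[x]$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, which
immediately entails: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.

Let $\alpha \in \mathcal{L}$. We will prove by induction that $\mathbf{s} \in \mathbf{S}$ does not fire in the
net state $Cl(\mathfrak{I}^{c,o}(\alpha))$ iff $\mathbf{s}$ is a minimal $\alpha$-state according to $\mathfrak{M}_P$. Let
$\langle N_0, \ldots, N_k \rangle$ be the canonical partition of $\mathcal{I}$ as usual.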

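• Induction basis:

let $\mathbf{s} \in N_0$ ($\mathbf{s} \neq bias$ since $\mathbf{s} \in \mathbf{S}$);

$Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ iff $\mathbf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ iff $\mathbf{s}$ is an $\alpha$-state. Moreover, every
state $\mathbf{s}$ in $N_0$ is minimal according to $\prec$ by def. of $E$ and $I$.

• Induction step:

assume that for every $\mathbf{s} \in N_0 \cup \ldots \cup N_i$: $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ iff $\mathbf{s}$ is a
minimal $\alpha$-state. Now consider an arbitrary $\mathbf{s} \in N_{i+1}$:

$Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ iff $\mathbf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ and $\neg\exists m \in N_j$ with $j \leq i$ s.t.
($Cl(\mathfrak{I}^{c,o}(\alpha))(m) = 1$, $m \; E \; \mathbf{s}$, $\neg\exists m' \in N_u$ with $u \leq i$ ($Cl(\mathfrak{I}^{c,o}(\alpha))(m') = 1$,
$m' \; I \; \langle m, \mathbf{s} \rangle$)). But this is the case if and only if $\mathbf{s}$ is a minimal $\alpha$-state,
for the following reasons:

first, $\mathbf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ iff $\mathbf{s}$ is an $\alpha$-state; second, by def. of $E$ and $I$,

$\neg\exists m \in N_j$ s.t. $j \leq i$ and $Cl(\mathfrak{I}^{c,o}(\alpha))(m) = 1$, $m \; E \; \mathbf{s}$, $\neg\exists m' \in N_u$ with $u \leq i$ s.t.
($Cl(\mathfrak{I}^{c,o}(\alpha))(m') = 1$, $m' \; I \; \langle m, \mathbf{s} \rangle$) iff

$\forall \mathbf{s}' \in L_{\mathbf{s}}$ it holds that $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}') = 1$ iff

(by induction hypothesis) $\forall \mathbf{s}' \in L_{\mathbf{s}}$ it holds that $\mathbf{s}'$ is no minimal $\alpha$-state.
But $\mathbf{s}$ is an $\alpha$-state and $\forall \mathbf{s}' \in L_{\mathbf{s}}$ ($\mathbf{s}'$ is no minimal $\alpha$-state) iff $\mathbf{s}$ is a
minimal $\alpha$-state (by the Smoothness Condition). Therefore, we have that
$Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ iff $\mathbf{s}$ is a minimal $\alpha$-state.

We know that $\mathfrak{M}_P \models \alpha[x] \Rightarrow \beta[x]$ iff all minimal $\alpha$-states are $\beta$-states.
But the latter is the case if and only if for all $\mathbf{s} \in \mathbf{S}$: if $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ then
$\mathbf{s} \notin \mathfrak{I}^{c,o}(\beta)$, or equivalently, for all $\mathbf{s} \in \mathbf{S}$: if $\mathbf{s} \in \mathfrak{I}^{c,o}(\beta)$ then $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 1$.

So, $\mathfrak{M}_P \models \alpha[x] \Rightarrow \beta[x]$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$. ■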

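Remark 235 Note that the class of networks which are constructed in the
way of the proof of lemma 234 is actually a proper subclass of the class of all
antitone networks (as can be seen from constructing a counterexample).

Theorem 236 (P-Representation)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory:

$\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ is a consistent conditional P-theory extending $\mathcal{TH}_{\rightarrow}$ iff
there is a preferential partially interpreted and antitone inhibition network
agent $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o}, \mathcal{N} \rangle$, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$ and
$\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$.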

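Corollary 237 Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$, $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. $\alpha[x] \Rightarrow \beta[x]$ is true in all preferential partially interpreted antitone
inhibition network agents $\mathfrak{N}$, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, iff for all consistent
conditional P-theories $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\rightarrow}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

2. $\alpha[x] \Rightarrow \beta[x]$ is preferential-net-valid iff for all consistent conditional
P-theories $\mathcal{TH}_{\Rightarrow}$ extending the ded. closure $dc(\{\top\})$ of $\{\top\}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

3. (P-Soundness/P-Completeness II)

$\alpha[x] \Rightarrow \beta[x]$ is preferential-net-valid iff $\alpha[x] \Rightarrow \beta[x]$ is P-provable (rel. to
$dc(\{\top\})$)

4. $KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$ iff for all consistent conditional P-theories $\mathcal{TH}_{\Rightarrow} \supseteq KB_{\Rightarrow}$,
s.t. $\mathcal{TH}_{\Rightarrow}$ extends $dc(\{\top\})$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

5. (P-Soundness/P-Completeness III)

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$ iff $KB_{\Rightarrow} \vdash_{\mathbf{P}} \alpha[x] \Rightarrow \beta[x]$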

Proof: 

1. Apply theorem 236; 

2. apply theorem 236 to 1 in definition 231; 

3. use 2 from above; 

4. apply theorem 236 to 3 in definition 231;

5. analogously to 5 in corollary 171. ■




Chapter 25 

CUMULATIVE INTERPRETED INHIBITION NET AGENTS 
AND THE SYSTEM C 

25.1 The Network Semantics for Cumulative Interpreted Inhibition 
Net Agents 

Now we do not any longer presuppose that net agents are based on FHINs, 
but we generalize to networks where the closure states are “input-determined” 
(FHINs are just a special case): 

Definition 238 (Cumulative Interpreted Inhibition Network Agents)

A cumulative (interpreted inhibition) network agent $\mathfrak{N}$ is a quintuple
$\langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ where

1. Sys = (5^, S'", nc^ no) is a system 

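2. $\mathcal{I} = \langle N, E, I, bias \rangle$ is a finite inhibition net, s.t. $Cl(\{bias\}) \neq N$

3. $Sys$ has $\mathcal{I}$ under $g$ as its central subsystem

4. $\mathfrak{I}^{p}: \mathcal{L} \to \wp(N^p)$, where $N^p$ is the set of nodes of the perceptual subsystem
of $Sys$

5. $\mathfrak{I}^{c,o}: \mathcal{L} \to \wp(N)$, s.t. $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^{p}(\varphi))$

and the following postulates are satisfied:

(a) $\mathfrak{I}^{c,o}(\top) = \{bias\}$

$\mathfrak{I}^{c,o}(\bot) = N$

(b) let $\mathcal{TH}_{\mathfrak{I}^{c,o}} = \{\varphi \in \mathcal{L} \mid \mathfrak{I}^{c,o}(\varphi) = \{bias\}\}$:

for all $\varphi, \psi \in \mathcal{L}$: if $\mathcal{TH}_{\mathfrak{I}^{c,o}} \vdash \varphi \to \psi$ then $\mathfrak{I}^{c,o}(\varphi) \supseteq \mathfrak{I}^{c,o}(\psi)$

(c) for all $\varphi, \psi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi \wedge \psi) = \mathfrak{I}^{c,o}(\varphi) \cup \mathfrak{I}^{c,o}(\psi)$

(d) for all $\varphi \in \mathcal{L}$: $bias \in \mathfrak{I}^{c,o}(\varphi)$

6. for all $\varphi \in \mathcal{L}$:

there is a state $Cl(\mathfrak{I}^{c,o}(\varphi))$ which is stable under the input $\mathfrak{I}^{c,o}(\varphi)$, s.t.
for all $s \in S$ there is an $i \in \mathbb{N}$ with $F^{i}_{\mathfrak{I}^{c,o}(\varphi)}(s) = Cl(\mathfrak{I}^{c,o}(\varphi))$.

Conditional theories are defined as before. We can define validity and
entailment as follows: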

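Definition 239 (Cumulative Network Semantics)

1. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$ is called cumulative-net-valid iff

for every cumulative net agent $\mathfrak{N}$: $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

2. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative interpreted
inhibition network agent: we say that

$\mathfrak{N} \models KB_{\Rightarrow}$ iff

for every $\alpha[x] \Rightarrow \beta[x] \in KB_{\Rightarrow}$ it holds that $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

3. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
we say that

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$

($KB_{\Rightarrow}$ cumulative-net-implies $\alpha[x] \Rightarrow \beta[x]$) iff

for every cumulative net agent $\mathfrak{N}$: if $\mathfrak{N} \models KB_{\Rightarrow}$, then $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.

We loosely refer to the notions defined in this section by the term
‘cumulative net semantics’.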

25.2 The Representation Theorem for C 

25.2.1 The Soundness Lemma for C 

For the proof of the soundness result we again need a lemma first, which ex- 
presses cumulativity. Note that the method used to prove this lemma is different 
from the method we have used to prove the same lemma for the case of CL, 
but that the new method would also be applicable for CL: 

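Lemma 240 Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative interpreted
inhibition network agent, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:

if $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$ then $Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$.

Proof:

We know that for all $s \in S$ there is a $j \in \mathbb{N}$ with
$F^{j}_{\mathfrak{I}^{c,o}(\alpha \wedge \beta)}(s) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$. Since $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$ it holds that
$Cl(\mathfrak{I}^{c,o}(\alpha)) \supseteq \mathfrak{I}^{c,o}(\beta)$.

Now let $j \in \mathbb{N}$ s.t. $F^{j}_{\mathfrak{I}^{c,o}(\alpha \wedge \beta)}(Cl(\mathfrak{I}^{c,o}(\alpha))) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$.

We can show that $F_{\mathfrak{I}^{c,o}(\alpha \wedge \beta)}(Cl(\mathfrak{I}^{c,o}(\alpha))) = Cl(\mathfrak{I}^{c,o}(\alpha))$, for:
by def. of $F_{s^*}$ we have for all $n \in N \setminus \{bias\}$:

$F_{\mathfrak{I}^{c,o}(\alpha)}(Cl(\mathfrak{I}^{c,o}(\alpha)))(n) = 1$ iff
$\mathfrak{I}^{c,o}(\alpha)(n) = 1$ or
$\exists n_1 \in N (Cl(\mathfrak{I}^{c,o}(\alpha))(n_1) = 1, n_1 \; E \; n, \neg\exists n_2 \in N (Cl(\mathfrak{I}^{c,o}(\alpha))(n_2) = 1,
n_2 \; I \; \langle n_1, n \rangle))$

iff, since $F_{\mathfrak{I}^{c,o}(\alpha)}(Cl(\mathfrak{I}^{c,o}(\alpha))) = Cl(\mathfrak{I}^{c,o}(\alpha))$,

$Cl(\mathfrak{I}^{c,o}(\alpha))(n) = 1$;

thus it follows for all $n \in N \setminus \{bias\}$:

$F_{\mathfrak{I}^{c,o}(\alpha \wedge \beta)}(Cl(\mathfrak{I}^{c,o}(\alpha)))(n) = 1$ iff, by def. of $F_{s^*}$,

$\mathfrak{I}^{c,o}(\alpha \wedge \beta)(n) = 1$ or
$\exists n_1 \in N (Cl(\mathfrak{I}^{c,o}(\alpha))(n_1) = 1, n_1 \; E \; n, \neg\exists n_2 \in N (Cl(\mathfrak{I}^{c,o}(\alpha))(n_2) = 1,
n_2 \; I \; \langle n_1, n \rangle))$

iff, since $\mathfrak{I}^{c,o}(\alpha \wedge \beta) = \mathfrak{I}^{c,o}(\alpha) \cup \mathfrak{I}^{c,o}(\beta)$,

$\mathfrak{I}^{c,o}(\alpha)(n) = 1$ or $\mathfrak{I}^{c,o}(\beta)(n) = 1$ or
$\exists n_1 \in N (Cl(\mathfrak{I}^{c,o}(\alpha))(n_1) = 1, n_1 \; E \; n, \neg\exists n_2 \in N (Cl(\mathfrak{I}^{c,o}(\alpha))(n_2) = 1,
n_2 \; I \; \langle n_1, n \rangle))$

iff, because of what we have shown for $F_{\mathfrak{I}^{c,o}(\alpha)}(Cl(\mathfrak{I}^{c,o}(\alpha)))$ above,
$\mathfrak{I}^{c,o}(\beta)(n) = 1$ or $Cl(\mathfrak{I}^{c,o}(\alpha))(n) = 1$ iff, because $Cl(\mathfrak{I}^{c,o}(\alpha)) \supseteq \mathfrak{I}^{c,o}(\beta)$,
$Cl(\mathfrak{I}^{c,o}(\alpha))(n) = 1$.

So we have $F_{\mathfrak{I}^{c,o}(\alpha \wedge \beta)}(Cl(\mathfrak{I}^{c,o}(\alpha))) = Cl(\mathfrak{I}^{c,o}(\alpha))$, therefore also
$F^{j}_{\mathfrak{I}^{c,o}(\alpha \wedge \beta)}(Cl(\mathfrak{I}^{c,o}(\alpha))) = Cl(\mathfrak{I}^{c,o}(\alpha))$, and so by the property of
input-determinedness from above: $Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$. ■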
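
The content of lemma 240 can be tried out on a toy example. The Python sketch below is only an illustration under stated assumptions: inputs are represented directly as sets of nodes (standing in for the interpretations of α and β), the closure is computed by iterating the update rule from the proof above with the bias node always active, and the net is hierarchical so that the iteration stabilises. The names `closure`, `subsets`, `cumulativity_holds` and the toy net are hypothetical.

```python
from itertools import combinations

BIAS = "bias"

def closure(nodes, exc, inh, inp):
    """Stable state of a hierarchical net under input `inp` (bias always on)."""
    inp = frozenset(inp) | {BIAS}
    state = inp
    for _ in range(len(nodes) + 1):
        nxt = set(inp)
        for (src, dst) in exc:
            blocked = any(b in state for (b, edge) in inh if edge == (src, dst))
            if src in state and not blocked:
                nxt.add(dst)
        if nxt == state:
            return state
        state = frozenset(nxt)
    raise ValueError("no stable state reached (is the net hierarchical?)")

def subsets(items):
    """All subsets of `items` as frozensets."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def cumulativity_holds(nodes, exc, inh):
    """If input b is contained in Cl(a), then Cl(a | b) must equal Cl(a)."""
    for a in subsets(nodes):
        cl_a = closure(nodes, exc, inh, a)
        for b in subsets(cl_a):
            # a | b plays the role of the interpretation of the conjunction
            if closure(nodes, exc, inh, a | b) != cl_a:
                return False
    return True

NODES = {BIAS, "a", "b", "c"}
EXC = {(BIAS, "b"), ("b", "c")}
INH = {("a", (BIAS, "b"))}        # a inhibits bias -> b
cumulative = cumulativity_holds(NODES, EXC, INH)
print(cumulative)
```

The check enumerates every pair of inputs for which the premise of the lemma holds, so a single `False` would point to a counterexample in the toy net.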

Lemma 241 (C-Soundness I)

Let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative interpreted inhibition
network agent:

then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is a consistent conditional C-theory extending $\mathcal{TH}_{\rightarrow}(\mathfrak{N})$.

Proof:

Let $\mathcal{I} = \langle N, E, I, bias \rangle$ be a finite inhibition net and
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative net agent:

1. Reflexivity, Left Logical Equivalence, Right Weakening: analogous to the
proof of lemma 165;

2. Cautious Cut: if $\mathfrak{N} \models \alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]$, $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, then
$\mathfrak{N} \models \alpha[x] \Rightarrow \gamma[x]$;

we know from lemma 240 that by assumption $Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$.
But also by assumption it holds that $\mathfrak{I}^{c,o}(\gamma) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$ and
we are done.

3. Cautious Monotonicity: if $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, $\mathfrak{N} \models \alpha[x] \Rightarrow \gamma[x]$, then
$\mathfrak{N} \models \alpha[x] \wedge \beta[x] \Rightarrow \gamma[x]$;

for again by assumption and by lemma 240 we have that
$Cl(\mathfrak{I}^{c,o}(\alpha)) = Cl(\mathfrak{I}^{c,o}(\alpha \wedge \beta))$; but by assumption we also know that
$\mathfrak{I}^{c,o}(\gamma) \subseteq Cl(\mathfrak{I}^{c,o}(\alpha))$, and we are done again.

$\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is consistent by the same reasoning as in lemma 165. ■

Lemma 242 Loop is not necessarily satisfied in cumulative interpreted
inhibition network agents.

Proof:

A counterexample may be constructed from the cumulative (but not
cumulative-ordered) model described in Kraus et al.[85], p.188, in the same
way as we construct networks from models in the proof of the completeness
lemma 243 below. ■

25.2.2 The Completeness Lemma for C 

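Lemma 243 (C-Completeness I)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, i.e. deductively closed:
for every consistent conditional C-theory $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ extending $\mathcal{TH}_{\rightarrow}$
there is a cumulative interpreted inhibition network agent
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$,
s.t.

• $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and

• $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$, i.e. for every $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:

$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.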



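Proof:

First we construct a network analogously as in the proof of lemma 169,
i.e.:

By theorem 104 stated in chapter 11 (and proved by Kraus et al.[85],
pp. 184-185) for every $\mathcal{TH}_{\Rightarrow}$ as above there is a finite cumulative model
$\mathfrak{M}_C = \langle \mathbf{S}, l, \prec \rangle$ s.t. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{M}_C \models \alpha[x] \Rightarrow \beta[x]$, i.e. all
states minimal with respect to $\prec$ which make $\alpha$ true, also make $\beta$ true. We
use $\mathfrak{M}_C$ to construct the intended input-determined interpreted network $\mathfrak{N}$.

Let $N = \{bias\} \cup \mathbf{S}$. Let $E = \{\langle bias, \mathbf{s} \rangle \mid \mathbf{s}$ is not minimal according to $\prec\}
\cup \{\langle \mathbf{s}, \mathbf{s}' \rangle \mid \mathbf{s} \prec \mathbf{s}'\}$. For every $\mathbf{s} \in \mathbf{S}$ let $L_{\mathbf{s}} = \{\mathbf{s}' \in \mathbf{S} \mid \mathbf{s}' \prec \mathbf{s}\}$; say,
$L_{\mathbf{s}} = \{\mathbf{s}_1, \ldots, \mathbf{s}_{r_{\mathbf{s}}}\}$.

Now we define:

$I_{\mathbf{s}} = \{\langle bias, \langle \mathbf{s}_1, \mathbf{s} \rangle \rangle, \langle \mathbf{s}_1, \langle \mathbf{s}_2, \mathbf{s} \rangle \rangle, \ldots, \langle \mathbf{s}_{r_{\mathbf{s}} - 1}, \langle \mathbf{s}_{r_{\mathbf{s}}}, \mathbf{s} \rangle \rangle, \langle \mathbf{s}_{r_{\mathbf{s}}}, \langle bias, \mathbf{s} \rangle \rangle\}$.

If $\mathbf{s}$ is minimal in $\prec$, then let $I_{\mathbf{s}} = \emptyset$. Let $I = \bigcup_{\mathbf{s} \in \mathbf{S}} I_{\mathbf{s}}$. Obviously, $I \subseteq N \times E$.
Since $\mathcal{I}$ is finite (because $\mathbf{S}$ is), $\mathcal{I}$ is a finite inhibition net.

We define for $\varphi \in \mathcal{L}$: $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup \{\mathbf{s} \mid \mathbf{s}$ does not make $\varphi$ true$\}$.

It is obvious that $\mathfrak{I}^{c,o}$ is an interpretation mapping, except for $Cl(\{bias\}) \neq N$
which follows from the subsequent considerations. Define $N^p$, $\mathfrak{I}^{p}$, and $g$ in a
way, s.t. $\mathfrak{I}^{c,o}(\varphi) = \{bias\} \cup g(\mathfrak{I}^{p}(\varphi))$. $\mathcal{I}$, $g$, $\mathfrak{I}^{p}$, $\mathfrak{I}^{c,o}$ determine a system $Sys$, s.t.
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ is a cumulative interpreted inhibition network agent.

This is the case since $\mathfrak{N}$ is input-determined, i.e., there is a closure
$Cl(\mathfrak{I}^{c,o}(\alpha))$ for all $\alpha \in \mathcal{L}$, and $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ iff $\mathbf{s}$ is a minimal $\alpha$-state
according to $\mathfrak{M}_C$. From that it follows that $\mathfrak{M}_C \models \alpha[x] \Rightarrow \beta[x]$ iff
$\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$, which entails: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.

The proof method we apply is not identical to the one used in the proof
of lemma 169, but it might also be applied there. Let $\alpha \in \mathcal{L}$, let $s$ be a
parameter-setting of $\mathcal{I}$.

For all $\mathbf{s} \in \mathfrak{I}^{c,o}(\alpha)$ we have that $F_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}) = 1$ and also that $\mathbf{s}$ does
not make $\alpha$ true in $\mathfrak{M}_C$ and is thus no minimal $\alpha$-state according to $\mathfrak{M}_C$. The
same holds for $F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s})$ where $i \geq 1$.

So we can concentrate on the case where $\mathbf{s} \notin \mathfrak{I}^{c,o}(\alpha)$:
by the def. of $F_{s^*}$ we know that $F_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}) = 0$ iff
$\mathfrak{I}^{c,o}(\alpha)(\mathbf{s}) = 0$, and $\neg\exists n_1 \in N (s(n_1) = 1, n_1 \; E \; \mathbf{s}, \neg\exists n_2 \in N (s(n_2) = 1,
n_2 \; I \; \langle n_1, \mathbf{s} \rangle))$ iff, since $\mathbf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ by assumption,

$\neg\exists n_1 \in N (s(n_1) = 1, n_1 \; E \; \mathbf{s}, \neg\exists n_2 \in N (s(n_2) = 1, n_2 \; I \; \langle n_1, \mathbf{s} \rangle))$.

Now we distinguish between two subcases:

1. $\mathbf{s}$ is a minimal $\alpha$-state according to $\mathfrak{M}_C$;

thus all states in $\mathfrak{M}_C$ which are below $\mathbf{s}$ (if there are any) are no $\alpha$-states.
By the def. of $\mathfrak{I}^{c,o}$ it follows that $\mathfrak{I}^{c,o}(\alpha) \supseteq L_{\mathbf{s}} = \{\mathbf{s}' \in \mathbf{S} \mid \mathbf{s}' \prec \mathbf{s}\}$, and
so for all $\mathbf{s}' \in L_{\mathbf{s}}$ and $i \geq 1$ we have $F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}') = 1$ from the above.
But then every excitatory connection to $\mathbf{s}$ is inhibited in $F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)$ for
all $i \geq 2$ by the def. of $I$. Since $\mathbf{s} \notin \mathfrak{I}^{c,o}(\alpha)$ this implies: $F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}) = 0$
for all $i \geq 2$.

2. $\mathbf{s}$ is no minimal $\alpha$-state according to $\mathfrak{M}_C$;

in this case there is an $\mathbf{s}_j \in L_{\mathbf{s}}$ s.t. $\mathbf{s}_j$ is a minimal $\alpha$-state according
to $\mathfrak{M}_C$ (this is by the smoothness of $\mathfrak{M}_C$). But we have just shown that
$F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}_j) = 0$ for all $i \geq 2$. So there is an uninhibited excitatory
connection to $\mathbf{s}$ for all $i \geq 2$ by the def. of $E$, and we have: $F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}) = 1$
for all $i \geq 2$.

Summing up we have proved that for all $\mathbf{s} \in \mathbf{S}$: $F^{i}_{\mathfrak{I}^{c,o}(\alpha)}(s)(\mathbf{s}) = 0$ (for
all $i \geq 2$) iff $\mathbf{s}$ is a minimal $\alpha$-state according to $\mathfrak{M}_C$. But since this holds for
all net parameter-settings $s \in S$ we have: there is a closure $Cl(\mathfrak{I}^{c,o}(\alpha))$ for all
$\alpha \in \mathcal{L}$, and $Cl(\mathfrak{I}^{c,o}(\alpha))(\mathbf{s}) = 0$ iff $\mathbf{s}$ is a minimal $\alpha$-state according to $\mathfrak{M}_C$. So
we are done. ■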
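
The construction used in the proof of lemma 243 can be sketched in code for a small ordering. The following Python fragment is a hedged illustration under simplifying assumptions: a formula α is represented only by the set of model states that make it true, the ordering is a three-element chain (so the resulting net happens to be hierarchical), and the closure is obtained by iterating the update rule from the input with the bias node always active. The names `build_net`, `closure`, `minimal_states`, `check` and the state labels are hypothetical.

```python
from itertools import combinations

BIAS = "bias"

def build_net(below):
    """below[s] lists the states s_1, ..., s_r lying below s (empty if s is minimal)."""
    nodes = {BIAS} | set(below)
    exc, inh = set(), set()
    for s, lower in below.items():
        if not lower:                      # minimal state: no incoming connections
            continue
        exc.add((BIAS, s))
        chain = [BIAS] + list(lower)
        for prev, cur in zip(chain, chain[1:]):
            exc.add((cur, s))              # <s_i, s>, since s_i lies below s
            inh.add((prev, (cur, s)))      # s_{i-1} (or bias) inhibits <s_i, s>
        inh.add((lower[-1], (BIAS, s)))    # s_r inhibits <bias, s>
    return nodes, exc, inh

def closure(nodes, exc, inh, inp):
    """Stable state under input `inp` (bias always on), by bounded iteration."""
    inp = frozenset(inp) | {BIAS}
    state = inp
    for _ in range(len(nodes) + 1):
        nxt = set(inp)
        for (src, dst) in exc:
            blocked = any(b in state for (b, edge) in inh if edge == (src, dst))
            if src in state and not blocked:
                nxt.add(dst)
        if nxt == state:
            return state
        state = frozenset(nxt)
    raise ValueError("no stable state reached")

def subsets(items):
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def minimal_states(below, alpha):
    """States in alpha with no state of alpha below them."""
    return {s for s in alpha if not any(t in alpha for t in below[s])}

BELOW = {"s1": [], "s2": ["s1"], "s3": ["s1", "s2"]}   # chain s1 < s2 < s3
NODES, EXC, INH = build_net(BELOW)

def check(alpha):
    inp = set(BELOW) - set(alpha)          # states that do not make alpha true
    cl = closure(NODES, EXC, INH, inp)
    silent = {s for s in BELOW if s not in cl}
    return silent == minimal_states(BELOW, alpha)

results = [check(alpha) for alpha in subsets(BELOW)]
print(all(results))
```

For this chain, the nodes that stay silent in the closure coincide with the minimal α-states for every choice of α, which is the property the proof establishes in general.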

If we take soundness and completeness together we get again a repre- 
sentation theorem: 

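Theorem 244 (C-Representation)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory:

$\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ is a consistent conditional C-theory extending $\mathcal{TH}_{\rightarrow}$ iff
there is a cumulative interpreted inhibition network agent
$\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$.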







Proof:

Lemma 241 proves the direction from the right to the left, lemma 243
proves the direction from the left to the right. ■

We also have: 

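Corollary 245 Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$, $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. $\alpha[x] \Rightarrow \beta[x]$ is true in all cumulative interpreted inhibition network agents
$\mathfrak{N}$, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, iff for all consistent conditional C-theories $\mathcal{TH}_{\Rightarrow}$
extending $\mathcal{TH}_{\rightarrow}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

2. $\alpha[x] \Rightarrow \beta[x]$ is cumulative-net-valid iff for all consistent conditional
C-theories $\mathcal{TH}_{\Rightarrow}$ extending the ded. closure $dc(\{\top\})$ of $\{\top\}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

3. (C-Soundness/C-Completeness II)

$\alpha[x] \Rightarrow \beta[x]$ is cumulative-net-valid iff $\alpha[x] \Rightarrow \beta[x]$ is C-provable (rel. to
$dc(\{\top\})$)

4. $KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$ iff for all consistent conditional C-theories $\mathcal{TH}_{\Rightarrow} \supseteq KB_{\Rightarrow}$,
s.t. $\mathcal{TH}_{\Rightarrow}$ extends $dc(\{\top\})$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

5. (C-Soundness/C-Completeness III)

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$ iff $KB_{\Rightarrow} \vdash_{\mathbf{C}} \alpha[x] \Rightarrow \beta[x]$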

Proof: 

1. apply theorem 244;

2. apply theorem 244 to 1 in definition 239;

3. use 2 from above;

4. apply theorem 244 to 3 in definition 239;

5. analogously to 5 in corollary 171. ■

26.1 The Network Semantics for Simple Cumulative Interpreted
Inhibition Net Agents

Now we consider again the (input-determined) cumulative interpreted
inhibition network agents from the last chapter, but this time without
inhibitory connections, i.e., where $I = \emptyset$.

So we can define: 

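Definition 246 (Simple Cumulative Network Semantics)

1. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$ is called simple-cumulative-net-valid iff

for every cumulative interpreted inhibition network agent $\mathfrak{N}$ without
inhibitory connections: $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

2. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\mathfrak{N} = \langle Sys, \mathcal{I}, g, \mathfrak{I}^{p}, \mathfrak{I}^{c,o} \rangle$ be a cumulative interpreted
inhibition network agent without inhibitory connections: we say that

$\mathfrak{N} \models KB_{\Rightarrow}$ iff

for every $\alpha[x] \Rightarrow \beta[x] \in KB_{\Rightarrow}$ it holds that $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$

3. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
we say that

$KB_{\Rightarrow} \models \alpha[x] \Rightarrow \beta[x]$

($KB_{\Rightarrow}$ simple-cumulative-net-implies $\alpha[x] \Rightarrow \beta[x]$) iff

for every cumulative interpreted inhibition network agent $\mathfrak{N}$ without
inhibitory connections:
if $\mathfrak{N} \models KB_{\Rightarrow}$, then $\mathfrak{N} \models \alpha[x] \Rightarrow \beta[x]$.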



We call this semantics ‘simple cumulative net semantics’. 
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26.2 The Representation Theorem for CM 

26.2.1 The Soundness Lemma for CM 
Lemma 247 (CM-Soundness I)

Let $\mathfrak{N}$ be a cumulative interpreted inhibition network agent without
inhibitory connections:
then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is a consistent conditional CM-theory extending $\mathcal{TH}_{\rightarrow}(\mathfrak{N})$.

Proof: 

We get C by lemma 241. Monotonicity is entailed by remark 131.



26.2.2 The Completeness Lemma for CM 
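Lemma 248 (CM-Completeness I)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, i.e. deductively closed:
for every consistent conditional CM-theory $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ extending
$\mathcal{TH}_{\rightarrow}$ there is a cumulative interpreted inhibition network agent $\mathfrak{N}$
without inhibitory connections, s.t.

• $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and

• $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$, i.e. for every $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \vDash \alpha[x] \Rightarrow \beta[x]$.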



Proof: 

Analogous to the proof of lemma 243. 

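We use theorem 108 stated in chapter 11 (and proved by Kraus et
al. [85], pp. 201-202), i.e. for every $\mathcal{TH}_{\Rightarrow}$ as above there is a finite simple
cumulative model $\mathfrak{M} = \langle S, l, \prec \rangle$ s.t. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff
$\mathfrak{M} \vDash \alpha[x] \Rightarrow \beta[x]$. ■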



If we take soundness and completeness together we get the correspond- 
ing representation theorem: 

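Theorem 249 (CM-Representation)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory:

$\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ is a consistent conditional CM-theory extending $\mathcal{TH}_{\rightarrow}$ iff
there is a cumulative interpreted inhibition network agent $\mathfrak{N}$
without inhibitory connections, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and
$\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$.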

Proof: 

Lemma 247 proves the direction from the right to the left, lemma 248
proves the direction from the left to the right. ■
We also have:

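Corollary 250 Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$, $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. $\alpha[x] \Rightarrow \beta[x]$ is true in all cumulative interpreted inhibition network agents
$\mathfrak{N}$ without inhibitory connections, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, iff for all consistent
conditional CM-theories $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\rightarrow}$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

2. $\alpha[x] \Rightarrow \beta[x]$ is simple-cumulative-net-valid iff for all consistent
conditional CM-theories $\mathcal{TH}_{\Rightarrow}$ extending the ded. closure $dc(\{\top\})$ of $\{\top\}$:
$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

3. (CM-Soundness/CM-Completeness II)

$\alpha[x] \Rightarrow \beta[x]$ is simple-cumulative-net-valid iff $\alpha[x] \Rightarrow \beta[x]$ is CM-provable
(rel. to $dc(\{\top\})$)

4. $KB_{\Rightarrow} \vDash \alpha[x] \Rightarrow \beta[x]$ iff for all consistent conditional CM-theories
$\mathcal{TH}_{\Rightarrow} \supseteq KB_{\Rightarrow}$, s.t. $\mathcal{TH}_{\Rightarrow}$ extends $dc(\{\top\})$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

5. (CM-Soundness/CM-Completeness III)

$KB_{\Rightarrow} \vDash \alpha[x] \Rightarrow \beta[x]$ iff $KB_{\Rightarrow} \vdash \alpha[x] \Rightarrow \beta[x]$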

Proof: 

1. apply theorem 249;

2. apply theorem 249 to 1 in definition 246;

3. use 2 from above;

4. apply theorem 249 to 3 in definition 246;

5. analogously to 5 in corollary 171. ■




Chapter 27 

SIMPLE PREFERENTIAL INTERPRETED INHIBITION NET 
AGENTS AND THE SYSTEM M 



27.1 The Network Semantics for Simple Preferential Interpreted 
Inhibition Net Agents 

Now we consider net agents based on interpreted inhibition networks which are 
antitone and which do not have inhibitory connections. By ‘antitone’ we mean 
here: the net J is such that N is antitone in J. The property of being antitone 
together with the lack of inhibitory connections directly entails that in such 
networks every excitatory connection from a node which is not the bias, to 
another node is superfluous. Thus, in such nets the “essential” excitatory con- 
nections lead from the bias to other nodes. A node gets excited either directly 
by the input or by the bias node. These networks are of course trivial but they 
may be used in order to complete our intended list of representation results. 
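
A small Python sketch may make the triviality of such networks concrete. It assumes a toy reading in which the bias node is always active and the stable state is the least set closed under excitatory connections. It only illustrates the final observation above, that a node is excited either by the input or by the bias node, and it does not model the antitone condition itself. The names `BIAS`, `closure`, `bias_only_state`, and the node labels are hypothetical.

```python
from itertools import combinations

BIAS = "bias"  # hypothetical label for the always-active bias node


def closure(edges, input_nodes):
    """Least set containing the input and the bias, closed under excitatory edges."""
    active = set(input_nodes) | {BIAS}
    frontier = list(active)
    while frontier:
        node = frontier.pop()
        for target in edges.get(node, ()):
            if target not in active:
                active.add(target)
                frontier.append(target)
    return frozenset(active)


def bias_only_state(edges, input_nodes):
    """Closed form for nets whose only excitatory edges leave the bias node."""
    return frozenset(input_nodes) | {BIAS} | frozenset(edges.get(BIAS, ()))


def all_inputs(nodes):
    """Every subset of the non-bias nodes, as a frozenset."""
    return [frozenset(c)
            for r in range(len(nodes) + 1)
            for c in combinations(nodes, r)]


# Hypothetical net: the only excitatory connections lead from the bias node.
EDGES = {BIAS: ["p", "q"]}
NODES = ["p", "q", "r", "s"]

# The general closure coincides with the closed form on every possible input.
agree = all(closure(EDGES, s) == bias_only_state(EDGES, s)
            for s in all_inputs(NODES))
```

In this sketch the stable state is just the input together with a fixed set of nodes, so the input contributes nothing beyond itself.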

So we define, finally: 

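Definition 251 (Simple Preferential Network Semantics)

1. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$ is called simple-preferential-net-valid iff
for every preferential antitone interpreted inhibition network agent $\mathfrak{N}$
without inhibitory connections: $\mathfrak{N} \vDash \alpha[x] \Rightarrow \beta[x]$

2. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\mathfrak{N}$ be a preferential antitone
interpreted inhibition network agent without inhibitory connections: we
say that
$\mathfrak{N} \vDash KB_{\Rightarrow}$ iff
for every $\alpha[x] \Rightarrow \beta[x] \in KB_{\Rightarrow}$ it holds that $\mathfrak{N} \vDash \alpha[x] \Rightarrow \beta[x]$

3. let $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$, let $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
we say that
$KB_{\Rightarrow} \vDash \alpha[x] \Rightarrow \beta[x]$
($KB_{\Rightarrow}$ simple-preferential-net-implies $\alpha[x] \Rightarrow \beta[x]$) iff
for every preferential antitone interpreted inhibition network agent $\mathfrak{N}$
without inhibitory connections:
if $\mathfrak{N} \vDash KB_{\Rightarrow}$, then $\mathfrak{N} \vDash \alpha[x] \Rightarrow \beta[x]$.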

We call this semantics ‘simple preferential net semantics’. 



27.2 The Representation Theorem for M 

27.2.1 The Soundness Lemma for M 
Lemma 252 (M-Soundness I)

Let $\mathfrak{N}$ be a preferential antitone interpreted inhibition network agent
without inhibitory connections:
then $\mathcal{TH}_{\Rightarrow}(\mathfrak{N})$ is a consistent conditional M-theory extending $\mathcal{TH}_{\rightarrow}(\mathfrak{N})$.

Proof:

This is a consequence of lemmata 233 and 247. ■

27.2.2 The Completeness Lemma for M 

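Lemma 253 (M-Completeness I)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, i.e. deductively closed:
for every consistent conditional M-theory $\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ extending $\mathcal{TH}_{\rightarrow}$
there is a preferential antitone interpreted inhibition network agent $\mathfrak{N}$
without inhibitory connections, s.t.

• $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and

• $\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$, i.e. for every $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$:
$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff $\mathfrak{N} \vDash \alpha[x] \Rightarrow \beta[x]$.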



Proof: 

Again analogous to the proof of lemma 243.

We use theorem 109 stated in chapter 11 (and proved by Kraus et
al. [85], p. 203, although this is actually just the completeness theorem for
classical propositional logic), i.e. for every $\mathcal{TH}_{\Rightarrow}$ as above there is a finite simple
preferential model $\mathfrak{M} = \langle S, l, \prec \rangle$ s.t. $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$ iff
$\mathfrak{M} \vDash \alpha[x] \Rightarrow \beta[x]$. ■



If we take soundness and completeness together we get the representa- 
tion theorem: 

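Theorem 254 (M-Representation)

Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory:

$\mathcal{TH}_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$ is a consistent conditional M-theory extending $\mathcal{TH}_{\rightarrow}$
iff there is a preferential antitone interpreted inhibition network agent $\mathfrak{N}$
without inhibitory connections, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, and
$\mathcal{TH}_{\Rightarrow} = \mathcal{TH}_{\Rightarrow}(\mathfrak{N})$.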

Proof: 

Lemma 252 proves the direction from the right to the left, lemma 253 
proves the direction from the left to the right. ■ 

So we have: 

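Corollary 255 Let $\mathcal{TH}_{\rightarrow} \subseteq \mathcal{L}_{\rightarrow}$ be a theory, $\alpha[x] \Rightarrow \beta[x] \in \mathcal{L}_{\Rightarrow}$, $KB_{\Rightarrow} \subseteq \mathcal{L}_{\Rightarrow}$:

1. $\alpha[x] \Rightarrow \beta[x]$ is true in all preferential antitone interpreted inhibition
network agents $\mathfrak{N}$ without inhibitory connections, s.t. $\mathcal{TH}_{\rightarrow}(\mathfrak{N}) \supseteq \mathcal{TH}_{\rightarrow}$, iff
for all consistent conditional M-theories $\mathcal{TH}_{\Rightarrow}$ extending $\mathcal{TH}_{\rightarrow}$:
$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

2. $\alpha[x] \Rightarrow \beta[x]$ is simple-preferential-net-valid iff for all consistent
conditional M-theories $\mathcal{TH}_{\Rightarrow}$ extending the ded. closure $dc(\{\top\})$ of $\{\top\}$:
$\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

3. (M-Soundness/M-Completeness II)

$\alpha[x] \Rightarrow \beta[x]$ is simple-preferential-net-valid iff $\alpha[x] \Rightarrow \beta[x]$ is M-provable
(rel. to $dc(\{\top\})$)

4. $KB_{\Rightarrow} \vDash \alpha[x] \Rightarrow \beta[x]$ iff for all consistent conditional M-theories
$\mathcal{TH}_{\Rightarrow} \supseteq KB_{\Rightarrow}$, s.t. $\mathcal{TH}_{\Rightarrow}$ extends $dc(\{\top\})$: $\alpha[x] \Rightarrow \beta[x] \in \mathcal{TH}_{\Rightarrow}$

5. (M-Soundness/M-Completeness III)

$KB_{\Rightarrow} \vDash \alpha[x] \Rightarrow \beta[x]$ iff $KB_{\Rightarrow} \vdash \alpha[x] \Rightarrow \beta[x]$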

Proof: 

1. apply theorem 254;

2. apply theorem 254 to 1 in definition 251;

3. use 2 from above;

4. apply theorem 254 to 3 in definition 251;

5. analogously to 5 in corollary 171. ■